COMPUTATIONAL METHOD FOR IDENTIFYING ADHESIN AND
ADHESIN-LIKE PROTEINS OF THERAPEUTIC POTENTIAL

Field of the present Invention 

A computational method for identifying adhesin and adhesin-like proteins; computer
system for performing the method; and genes and proteins encoding adhesin and
adhesin-like proteins.

Background and prior art of the present Invention

The progress in genome sequencing projects has generated a large number of inferred
protein sequences from different organisms. It is expected that the availability of the
information on the complete set of proteins from infectious human pathogens will
enable us to develop novel molecular approaches to combat them. A necessary step in
the successful colonization and subsequent manifestation of disease by microbial
pathogens is the ability to adhere to host cells.

Microbial pathogens encode several proteins known as adhesins that mediate their
adherence to host cell surface receptors, membranes, or extracellular matrix for
successful colonization. Investigations into this primary event of host-pathogen
interaction over the past decades have revealed a wide array of adhesins in a variety of
pathogenic microbes. Presently, substantial information on the biogenesis of adhesins
and the regulation of adhesin factors is available. One of the best understood
mechanisms of bacterial adherence is attachment mediated by pili or fimbriae. Several
afimbrial adhesins have also been reported. In addition, limited knowledge of the target
host receptors has also been gained (Finlay, B.B. and Falkow, S. 1997).
New approaches to vaccine development focus on targeting adhesins to abrogate the
colonization process (Wizemann, et al 1999). However, the specific role of particular
adhesins has been difficult to elucidate. Thus, prediction of adhesins or adhesin-like
proteins and their functional characterization is likely to aid not only in deciphering the
molecular mechanisms of host-pathogen interaction but also in developing new vaccine
formulations, which can be tested in suitable experimental model systems.

One of the best understood mechanisms of bacterial adherence is attachment mediated
by pili or fimbriae, for example the FimH and PapG adhesins of Escherichia coli (Maurer,
L. and Orndorff, P. 1987; Bock, K. et al. 1985). Other examples of pili group adhesins
include type IV pili in Pseudomonas aeruginosa, Neisseria species, Moraxella species,
Enteropathogenic Escherichia coli and Vibrio cholerae (Sperandio V et al 1996).
Afimbrial adhesins include the HMW proteins of Haemophilus influenzae (van
Schilfgaarde 2000), the filamentous hemagglutinin and pertactin of Bordetella pertussis
(Bassinet et al 2000), BabA of H. pylori (Yu J et al 2002) and the YadA adhesin of
Yersinia enterocolitica (Neubauer et al 2000). The intimin receptor protein (Tir) of
Enteropathogenic E. coli (EPEC) is another type of adhesin (Ide T et al 2003). Other
classes of adhesins include the MrkD protein of Klebsiella pneumoniae, Hia of H. influenzae
(St Geme et al 2000), Ag I/II of Streptococcus mutans and SspA, SspB of
Streptococcus gordonii (Egland et al 2001), FnbA, FnbB of Staphylococcus aureus and
SfbI, protein F of Streptococcus pyogenes, and the PsaA of Streptococcus pneumoniae (De
et al 2003).

A known example of adhesins approved as vaccine is the acellular pertussis vaccine
containing FHA and pertactin against B. pertussis, the causative agent of whooping
cough (Halperin, S et al 2003). Immunization with FimH is being evaluated for
protective immunity against pathogenic E. coli (Langermann S et al 2000). In
Streptococcus pneumoniae, PsaA is being investigated as a potential vaccine candidate
against pneumococcal disease (Rapola, S et al 2003). Immunization results with BabA
adhesin showed promise for developing a vaccine against H. pylori (Prinz, C et al
2003). A synthetic peptide sequence anti-adhesin vaccine is being evaluated for
protection against Pseudomonas aeruginosa infections.

Screening for adhesin and adhesin-like proteins by conventional experimental methods
is laborious, time consuming and expensive. As an alternative, homology search is used
to facilitate the identification of adhesins. Although this procedure is useful in the
analysis of genome organization (Wolf et al 2001) and of metabolic pathways
(Peregrin-Alvarez et al 2003, Risen et al 2002), it is somewhat limited in allowing
functional predictions when the homologues are not functionally characterized or the
sequence divergence is high. Assignment of functional roles to proteins based on this
technique has been possible for only about 60% of the predicted protein sequences
(Fraser et al 2000). Thus, we explored the possibility of developing a non-homology
method based on sequence composition properties combined with the power of
Artificial Neural Networks to identify adhesins and adhesin-like proteins in species
belonging to a wide phylogenetic spectrum.

Twenty years ago, Nishikawa et al carried out some of the early attempts to classify
proteins into different groups based on compositional analysis (Nishikawa et al 1983).
More recently, the software PropSearch was developed for analyzing protein sequences
where conventional alignment tools fail to identify significantly similar sequences
(Hobohm, U. and Sander, C 1995). PropSearch uses 144 compositional properties of
protein sequences to detect possible structural or functional relationships between a
new sequence and sequences in the database. Recently the compositional attributes of
proteins have been used to develop software for predicting secretory proteins in
bacteria and apicoplast targeted proteins in Plasmodium falciparum by training
Artificial Neural Networks (Zuegge et al 2001).

Zuegge et al have used the 20 amino acid compositional properties. Their objective
was to extract features of apicoplast targeted proteins in Plasmodium falciparum. This
is distinct from our software SPAAN that focuses on adhesins and adhesin-like proteins
involved in host-pathogen interaction.

Hobohm and Sander have used 144 compositional properties including isoelectric point
and amino acid and dipeptide composition to generate hypotheses on the putative
functional role of proteins that are refractory to analysis using other sequence alignment
based approaches like BLAST and FASTA. Hobohm and Sander do not specifically
address the issue of adhesins and adhesin-like proteins, which is the focus of SPAAN.
Nishikawa et al had originally attempted to classify proteins into various functional
groups. This was a curiosity driven exercise but eventually led to the development of
software to discriminate extra-cellular proteins from intracellular proteins. This work
did not address the issue of adhesins and adhesin-like proteins, which is the focus of
SPAAN.

Thus, none of the aforementioned research groups have been able to envisage the
methodology of the instant application. The inventive method of this application
provides novel proteins and corresponding gene sequences.

Adhesins and adhesin-like proteins mediate host-pathogen interactions. This is the first
step in colonization of a host by microbial pathogens. Attempts worldwide are focused
on designing vaccine formulations comprising adhesin proteins derived from
pathogens. When immunized, the host will have its immune system primed against
adhesins for that pathogen. When a pathogen is actually encountered, the surveillance
mechanism will recognize these adhesins, bind them through antigen-antibody
interactions and neutralize the pathogen through the complement-mediated cascade and
other related clearance mechanisms. This strategy has been successfully employed in
the case of Whooping cough and is being actively pursued in the case of Pneumonia,
Gastric Ulcer and Urinary tract infections.
Objects of the present Invention 

The main object of the present invention is to provide a computational method for
identifying adhesin and adhesin-like proteins of therapeutic potential.

Another object of invention is to provide a method for screening the proteins with 
unique compositional characteristics as putative adhesins in different pathogens. 
Yet, another object of the invention is providing the use of gene sequences encoding 
the putative adhesin proteins useful as preventive therapeutics. 

Summary of the present Invention

A computational method for identifying adhesin and adhesin-like proteins, said method
comprising steps of computing the sequence-based attributes of protein sequences using
five attribute modules of software SPAAN, (i) amino acid frequencies, (ii) multiplet
frequency, (iii) dipeptide frequencies, (iv) charge composition, and (v) hydrophobic
composition, training the artificial neural Network (ANN) for each of the computed
five attributes, and identifying the adhesin and adhesin-like proteins having probability
of being an adhesin (Pad) as > 0.51; a computer system for performing the method; and
genes and proteins encoding adhesin and adhesin-like proteins.
Detailed description of the present Invention 

Accordingly, the present invention relates to a computational method for identifying
adhesin and adhesin-like proteins, said method comprising steps of computing the
sequence-based attributes of protein sequences using five attribute modules of software
SPAAN, (i) amino acid frequencies, (ii) multiplet frequency, (iii) dipeptide frequencies,
(iv) charge composition, and (v) hydrophobic composition, training the artificial neural
Network (ANN) for each of the computed five attributes, and identifying the adhesin
and adhesin-like proteins having probability of being an adhesin (Pad) as > 0.51; a
computer system for performing the method; and genes and proteins encoding adhesin
and adhesin-like proteins.

In an embodiment of the present invention, wherein the invention relates to a
computational method for identifying adhesin and adhesin-like proteins, said method
comprising steps of:

a. computing the sequence-based attributes of protein sequences using five
attribute modules of a neural network software, wherein the attributes
are (i) amino acid frequencies, (ii) multiplet frequency, (iii)
dipeptide frequencies, (iv) charge composition, and (v) hydrophobic
composition,

b. training the artificial neural Network (ANN) for each of the computed
five attributes, and

c. identifying the adhesin and adhesin-like proteins having probability of
being an adhesin (Pad) as > 0.51.



In another embodiment of the present invention, wherein the invention relates to a
method wherein the protein sequences are obtained from pathogens, eukaryotes, and
multicellular organisms.

In an embodiment of the present invention, wherein the invention relates to a method,
wherein the protein sequences are obtained from the pathogens selected from a group
of organisms comprising Escherichia coli, Haemophilus influenzae, Helicobacter
pylori, Mycoplasma pneumoniae, Mycobacterium tuberculosis, Rickettsia prowazekii,
Porphyromonas gingivalis, Shigella flexneri, Streptococcus mutans, Streptococcus
pneumoniae, Neisseria meningitidis, Streptococcus pyogenes, Treponema pallidum and
Severe Acute Respiratory Syndrome associated human coronavirus (SARS).
In yet another embodiment of the present invention, wherein the method of the 
invention is a non-homology method. 

In still another embodiment of the present invention, wherein the invention relates to
the method using 105 compositional properties of the sequences.

In still another embodiment of the present invention, wherein the invention relates to a 
method showing sensitivity of at least 90%. 

In still another embodiment of the present invention, wherein the invention relates to
the method showing specificity of 100%.

In still another embodiment of the present invention, wherein the invention relates to a 
method identifying adhesins from distantly related organisms. 

In still another embodiment of the present invention, wherein the neural network has
multi-layer feed forward topology, consisting of an input layer,
one hidden layer, and an output layer.

In still another embodiment of the present invention, wherein the number of neurons
in the input layer is equal to the number of input data points for
each attribute.
In still another embodiment of the present invention, wherein
the "Pad" is a weighted linear sum of the probabilities from five computed attributes.
In still another embodiment of the present invention, wherein
each trained network assigns a probability value of being an adhesin for the protein
sequence.

In still another embodiment of the present invention, wherein the invention relates to a
computer system for performing the method of claim said system comprising a
central processing unit, executing SPAAN program, giving probabilities based on
different attributes using Artificial Neural Network and in built other programs of
assessing attributes, all stored in a memory device accessed by CPU, a display on
which the central processing unit displays the screens of the above mentioned programs
in response to user inputs; and a user interface device.

In still another embodiment of the present invention, wherein the invention relates to a
set of 274 annotated genes encoding adhesin and adhesin-like proteins, having SEQ ID
Nos. 385 to 658.

In still another embodiment of the present invention, wherein the invention relates to a 
set of 105 hypothetical genes encoding adhesin and adhesin-like proteins, having SEQ 
ID Nos. 659 to 763. 

In still another embodiment of the present invention, wherein the invention relates to a
set of 279 annotated adhesin and adhesin-like proteins of SEQ ID Nos. 1 to 279.

In still another embodiment of the present invention, wherein the invention relates to a 
set of 105 hypothetical adhesin and adhesin-like proteins of SEQ ID Nos. 280 to 384. 
One more embodiment of the present invention, wherein the invention also relates to a
fully connected multilayer feed forward Artificial Neural Network based on the
computational method as claimed in claim 1, comprising of an input layer, a hidden
layer and an output layer which are connected in the said sequence, wherein each
neuron is a binary digit number and is connected to each neuron of the subsequent layer
for identifying adhesin or adhesin like proteins, wherein the program steps comprise:
[a] feeding a protein sequence in FASTA format; [b] processing the sequence
obtained in step [a] through the 5 modules named A, C, D, H and M, wherein attribute
A represents an amino acid composition, attribute C represents a charge composition,
attribute D represents a dipeptide composition of the 20 dipeptides [NG, RE, TN, NT,
GT, TT, DE, ER, RR, RK, RI, AT, TS, IV, SG, GS, TG, GN, VI and HR], attribute H
represents a hydrophobic composition and attribute M represents amino acid
frequencies in multiplets, to quantify 5 types of compositional attributes of the said
protein sequence to obtain numerical input vectors respectively for each of the said
attributes wherein the sum of numerical input vectors is 105; [c] processing of the
numerical input vectors obtained in step [b] by the input neuron layer to obtain signals,
wherein the number of neurons is equal to the number of numerical input vectors for
each attribute; [d] processing of signals obtained from step [c] by the hidden layer to
obtain synaptic weighted signals, wherein the optimal number of neurons in the hidden
layer was determined through experimentation for minimizing the error at the best
epoch for each network individually; [e] delivering synaptic weighted signals obtained
from step [d] to the output layer for assigning of a probability value for each protein
sequence fed in step [a] as being an adhesin by each network module; [f] using the
individual probabilities obtained from step [e] for computing the final probability of a
protein sequence being an adhesin denoted by the Pad value, which is a weighted
average of the individual probabilities obtained from step [e] and the associated fraction
of correlation which is a measure of the strength of the prediction.
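
Step [f] can be illustrated with a minimal sketch in Python. It assumes that the weight attached to each module probability is that module's fraction of correlation and that the weights are normalized by their sum; the function names `pad_value` and `is_putative_adhesin` and the numbers in the usage lines are hypothetical and are not taken from SPAAN.

```python
def pad_value(probabilities, fractions_of_correlation):
    """Weighted average of the per-module probabilities.

    Assumption: each module's weight is its fraction of correlation,
    and the weighted sum is divided by the sum of the weights.
    """
    if len(probabilities) != len(fractions_of_correlation):
        raise ValueError("one weight is needed per module probability")
    total_weight = sum(fractions_of_correlation)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    weighted = sum(p * w for p, w in zip(probabilities, fractions_of_correlation))
    return weighted / total_weight


def is_putative_adhesin(pad, threshold=0.51):
    """Apply the Pad > 0.51 cut-off stated in the method."""
    return pad > threshold


# Hypothetical outputs of the five modules in the order A, C, D, H, M.
module_probabilities = [0.9, 0.8, 0.7, 0.6, 0.5]
equal_weights = [1.0, 1.0, 1.0, 1.0, 1.0]
pad = pad_value(module_probabilities, equal_weights)
```

With equal weights the Pad value reduces to the plain mean of the five module outputs; unequal weights let a better-correlated module contribute more.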

In still another embodiment of the present invention, wherein the input neuron layer
consists of a total of 105 neurons corresponding to 105 compositional properties.
In still another embodiment of the present invention, wherein the hidden layer
comprises of neurons represented as 30 for amino acid frequencies, 28 for multiplet
frequencies, 28 for dipeptide frequencies, 30 for charge composition and 30 for
hydrophobic composition.

In still another embodiment of the present invention, wherein the output layer
comprises of neurons to deliver the output values as probability value for each protein
sequence.

Identification of novel adhesins and their characterization are important for studying
host-pathogen interactions and testing new vaccine formulations. We have employed
Artificial Neural Networks to develop an algorithm SPAAN (Software for Prediction of
Adhesin and Adhesin-like proteins using Neural Networks) that can identify adhesin
proteins using 105 compositional properties of a protein sequence. SPAAN could
correctly predict well characterized adhesins from several bacterial species and strains.
SPAAN showed 89% sensitivity and 100% specificity in a test data set that did not
contain proteins in the training set. Putative adhesins identified by the software can
serve as potential preventive therapeutics.

The present invention provides a novel computational method for identifying adhesin
and adhesin-like proteins of therapeutic potential. More particularly, the present
invention relates to candidate genes for these adhesins. The invention further provides
new leads for development of candidate genes, and their encoded proteins in their
functional relevance to preventive approaches. This computational method involves
calculation of several sequence attributes, and their subsequent analysis leads to the
identification of adhesin proteins in different pathogens. Thus, the present invention is
useful for identification of the adhesin proteins in pathogenic organisms. The adhesin
proteins from different genomes constitute a set of candidates for functional
characterization through targeted gene disruption, microarrays and proteomics. Further,
these proteins constitute a set of candidates for further testing in development of
preventive therapeutics. Also provided are the genes encoding the candidate adhesin
proteins.

The present method offers novelty in the principles used and the power of Neural
Networks to identify new adhesins compared to laborious and time consuming
conventional methods. The present method is based on compositional properties of
proteins instead of sequence alignments. Therefore this method has the ability to
identify adhesin and adhesin-like proteins from bacteria belonging to a wide
phylogenetic spectrum. The predictions made from this method are readily verifiable
through independent analysis and experimentation. The invention has the potential to
accelerate the development of new preventive therapeutics, which currently requires
high investment in terms of requirement of skilled labor and valuable time.

The present invention relates to a computational method for the identification of
candidate adhesin proteins of therapeutic potential. The invention particularly describes
a novel method to identify adhesin proteins in different genomes of pathogens. These
adhesin proteins can be used for developing preventive therapeutics.
Accordingly, the invention provides a computational method for identifying adhesin and
adhesin-like proteins of therapeutic potential which comprises calculation of 105
compositional properties under the five sequence attributes, namely, Amino Acid frequency,
Multiplet frequency, dipeptide frequency, charge composition and hydrophobic composition; and
then training Artificial Neural Network (ANN, Feed Forward Error Back Propagation)
using these properties for differentiating between adhesin and non-adhesin class of
proteins. This computational method involves quantifying 105 compositional attributes
of query proteins and qualifying them as adhesins or non-adhesins by a Pad value
(Probability of being an adhesin). The present invention is useful for identification of
adhesin and adhesin-like proteins in pathogenic organisms. These newly identified
adhesin and adhesin-like proteins constitute a set of candidates for development of new
preventive therapeutics that can be tested readily in suitable experimental model systems.
In addition, the genes encoding the candidate adhesin and adhesin-like proteins
are provided.

The invention provides a set of candidate adhesin and adhesin-like proteins and their
coding genes for further evaluation as preventive therapeutics. The method of invention
is based on the analysis of protein sequence attributes instead of sequence patterns
classified to functional domains. Present method is less dependent on sequence
relationships and therefore offers the potential power of identifying adhesins from
distantly related organisms. The invention provides a computational method, which
involves prediction of adhesin and adhesin-like proteins using Artificial Neural
Networks. The proteins termed adhesin were found to be predicted with a high
probability (Pad > 0.51) in various pathogens. Some adhesin sequences turned out to be
identical or homologous to proteins that are antigenic or implicated in virulence. By
this approach, proteins could be identified and short-listed for further testing in
development of new vaccine formulations to eliminate diseases caused by various
pathogenic organisms.
DESCRIPTION OF TABLES 
Table 1: Output file format given by SPAAN.

Table 2: Organism Name, Accession number, Number of base pairs, Date of release
and Total number of proteins.

Table 3: Prediction of well characterized adhesins from various bacterial pathogens
using SPAAN.

Table 4: Analysis of predictions made by SPAAN on genome scans of a few selected
pathogenic organisms.

Table 5: GI numbers and Gene IDs of new putative adhesins predicted by SPAAN in
the genomes listed in Table 2.

Table 6: GI numbers and Gene IDs of hypothetical proteins predicted as putative
adhesins by SPAAN in the genomes listed in Table 2.

Table 7: The list of 198 adhesins found in bacteria.
Brief description of the accompanying drawings 
Figure 1 shows the Neural Network architecture.

Figure 2 shows assessment of SPAAN using defined test dataset.

Figure 3 (a) shows histogram plots of the number of proteins in the various Pad value
ranges. (b) Pairwise sequence relationships among the adhesins were
determined using CLUSTAL W and plotted on X-axis. Higher scores indicate similar
pairs. (c) Plot for non-adhesins. Data are plotted in the 4 quadrant format for clear
inspection.

The software program was written in C Language and operated on Red Hat Linux 8.0
operating system. The computer program accepts input protein sequences in FastA
format and produces a tabulated output. The output Table contains one row for each
protein listing the probability outputs of each of the five modules, a weighted average
probability of these five modules (Pad), and the function of the protein as described in
the input sequence file. This software is called SPAAN (A Software for Prediction of
Adhesins and Adhesin-like proteins using Neural Networks) and a software copyright
has been filed. Although this software has multiple modules, the running of these
modules has been integrated and automated. The user only needs to run one
command.

AAcompo.c: 

Input: File containing protein sequences in the fasta format. 

Output: File containing frequencies of all 20 AAs for each protein in one row. 

charge.c:

Input: File containing protein sequences in the fasta format.

Output: File containing frequency of charged amino acids (R, K, E and D) and
moments (up to 18th order) of the positions of charged amino acids.

hdr.c:

Input: File containing protein sequences in the fasta format.

Output: File containing frequencies of 5 groups of amino acids formed on the
basis of their hydrophobicity and moments of their positions up to 5th order.

multiplets.c:

Input: File containing protein sequences in the fasta format.

Output: File containing fractions of multiplets of each of the 20 amino acids.

Input: File.1 containing protein sequences in the fasta format.
File.2 containing list of the significant dipeptides in dipeptide analysis.

Output: File containing frequencies of the dipeptides listed in the input File.2
for each protein in the input File.1.

train.c:

Input: File containing following specifications:

1. Number of input and output parameters.
2. Number of nodes in the hidden layers.
3. Names of the training, validate and test data files.
4. Learning rate, coefficient of moment.
5. Maximum number of cycles for training.

Output: Outputs are as follows.

1. Output of the trained NN for the test data set.
2. Values of the weight connections in the trained NN.
3. Some extra information about training.

recognize.c:

Input: File containing following specifications:

1. Number of input and output parameters.
2. Number of nodes in the hidden layers.
3. Name of the query input file.
4. Name of the file containing values of the weight connections for
trained NN.
5. Name of the output file.

Output: Outputs for the query entries calculated by the trained NN.
standards:

Input: File containing protein sequences in fasta format.

Output: File containing protein sequences in fasta format with all the new line
characters lying within a sequence removed.
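
The standardization step can be sketched as follows. This is an illustrative Python sketch of the described behaviour, not the original C program; the function name `standardize_fasta` is hypothetical, and blank lines are assumed to be ignorable.

```python
def standardize_fasta(text):
    """Join wrapped sequence lines so that every record becomes
    one header line followed by one sequence line."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # assumption: blank lines carry no information
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line, []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return "".join(f"{h}\n{s}\n" for h, s in records)


# Hypothetical two-record input with a wrapped first sequence.
example = ">p1 demo\nMKT\nAYI\n>p2\nGG\n"
cleaned = standardize_fasta(example)
```

Putting each sequence on a single line lets the later modules treat a string index as a residue position.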

filter.c:

Input: File containing protein sequences in fasta format. 

Output: File containing protein sequences from the input except those which
are short in length (<50 AAs) or which contain any amino acid other than the
20 known amino acids.
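
The filtering rule can be expressed as a small predicate. The sketch below is in Python rather than the original C, and the names `STANDARD_AMINO_ACIDS` and `keep_sequence` are hypothetical; it assumes the 20 known amino acids are those with the standard single-letter codes.

```python
# Single-letter codes of the 20 standard amino acids.
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


def keep_sequence(seq, min_length=50):
    """Return True when the sequence has at least min_length residues
    and contains only the 20 standard amino acid letters."""
    seq = seq.upper()
    return len(seq) >= min_length and set(seq) <= STANDARD_AMINO_ACIDS
```

A sequence of exactly 50 residues is kept, since only sequences shorter than 50 are excluded.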
The five attributes:

Amino Acid frequencies 

Amino acid frequency fi = (counts of ith amino acid in the sequence) / l; i = 1...20, where l is
the length of the protein.
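
As an illustration of this definition, a minimal Python sketch is given here. The ordering of the 20 amino acids in `AMINO_ACIDS` is an arbitrary (alphabetical) assumption, and the function name is hypothetical.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed ordering of the 20 inputs


def amino_acid_frequencies(seq):
    """Return the 20 values f_i = (count of amino acid i) / l."""
    seq = seq.upper()
    length = len(seq)
    if length == 0:
        raise ValueError("empty sequence")
    return [seq.count(aa) / length for aa in AMINO_ACIDS]


# Toy sequence: A occurs twice, G and K once each, so l = 4.
freqs = amino_acid_frequencies("AAGK")
```

For a sequence made only of the 20 standard residues, the 20 frequencies sum to 1.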
Multiplet frequency 

Multiplets are defined as homopolymeric stretches (X)n where X is any of the 20 amino
acids and n is an integer > 2. After identifying all the multiplets, the frequencies of the
amino acids in the multiplets were computed as
fi(m) = (counts of ith amino acid occurring as multiplet) / l
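
The multiplet frequency can be sketched in Python as below. The minimum run length is passed explicitly as `min_run` so that the threshold on n follows the definition above rather than being fixed in the code; the function name and the amino acid ordering are assumptions.

```python
from itertools import groupby

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # assumed ordering of the 20 inputs


def multiplet_frequencies(seq, min_run):
    """Return f_i(m) = (residues of type i lying in homopolymeric runs
    of at least min_run residues) / l for each of the 20 amino acids."""
    seq = seq.upper()
    length = len(seq)
    if length == 0:
        raise ValueError("empty sequence")
    counts = dict.fromkeys(AMINO_ACIDS, 0)
    for residue, run in groupby(seq):
        run_length = len(list(run))
        if run_length >= min_run and residue in counts:
            counts[residue] += run_length
    return [counts[aa] / length for aa in AMINO_ACIDS]


# Toy sequence with runs AAA and KK; the final single A is not a multiplet.
fm = multiplet_frequencies("AAAGKKTA", min_run=2)
```

Each residue inside a qualifying run is counted, so a run of three alanines contributes 3 to the alanine count.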
Dipeptide frequencies 

The frequency of a dipeptide (i, j) fij = (counts of ijth dipeptide) / (total dipeptide
counts); i, j ranges from 1 to 20.
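
A minimal Python sketch of this frequency follows. It assumes that dipeptides are counted in overlapping windows, so a protein of length l has l - 1 dipeptides; the function name and the example dipeptides are hypothetical.

```python
from collections import Counter


def dipeptide_frequencies(seq, dipeptides):
    """Return f_ij = (count of dipeptide ij) / (total dipeptide count)
    for each requested dipeptide, using overlapping windows (assumption)."""
    seq = seq.upper()
    total = len(seq) - 1
    if total < 1:
        raise ValueError("sequence must contain at least two residues")
    counts = Counter(seq[k:k + 2] for k in range(total))
    return [counts[dp] / total for dp in dipeptides]


# Toy sequence TTTG contains the dipeptides TT, TT and TG.
fd = dipeptide_frequencies("TTTG", ["TT", "TG", "NG"])
```

Passing the list of dipeptides as an argument keeps the function usable for all 400 dipeptides or for a selected subset.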

It has been found that dipeptide repeats in proteins are important for functional
expression of the clumping factor present on Staphylococcus aureus cell surface that
binds to fibrinogen (Hartford et al 1999). Thus we included the dipeptide frequency
module. The total number of dipeptides is 400. For optimal training of Neural Network,
the ratio of total number of input vectors to the total number of weight connections
must be around 2 to avoid over fitting (Andrea et al). Therefore, we identified the
dipeptides whose frequencies in the adhesin data set (469 proteins, see database
construction) were significantly different from that in the non-adhesin dataset (703
proteins) using t-test. The frequencies of top 20 dipeptides (when arranged in the
descending order of the p-values of t-test) were fed to the Neural Network. These
dipeptides were (using single letter IUPAC-IUB code) NG, RE, TN, NT, GT, TT, DE,
ER, RR, RK, RI, AT, TS, IV, SG, GS, TG, GN, VI, and HR. With frequency inputs
for 20 dipeptides and 28 neurons in the 2nd layer, the total number of weight
connections is 588, and is in keeping with the criterion of avoiding over fitting.
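
The figure of 588 can be checked with a short calculation. The counting convention in the Python sketch below is an assumption chosen because it reproduces the stated number: each of the 28 hidden neurons receives 20 input weights plus one bias weight, and hidden-to-output weights are not counted. Treating all 469 + 703 proteins as the input vectors is likewise an assumption.

```python
def input_to_hidden_weights(n_inputs, n_hidden, include_bias=True):
    """Count input-to-hidden weight connections, optionally with one
    bias weight per hidden neuron (assumed counting convention)."""
    return (n_inputs + (1 if include_bias else 0)) * n_hidden


weights = input_to_hidden_weights(20, 28)   # (20 + 1) * 28
n_vectors = 469 + 703                       # adhesins + non-adhesins
ratio = n_vectors / weights                 # close to the target of about 2
```

Under these assumptions the ratio of input vectors to weight connections is 1172 / 588, which is approximately 1.99.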
Charge composition 

The input frequency of charged amino acids (R, K, E and D considering the ionization
properties of the side chains at pH 7.2) is given by fc = (counts of charged amino acids) / l.
Further, information on the characteristics of the distribution of the charged amino
acids in a given protein sequence was provided by computing the moments of the
positions of the occurrences of the charged amino acids. Since moments characterize
the patterns of distribution such as skewness and kurtosis (sharpness of the peak) we
have used them to represent the distribution patterns of the charged residues in the
sequence.

The general expression to compute moments of a given order, say r, is

$$M_r = \frac{1}{N}\sum_{i=1}^{N}\left(X_i - X_m\right)^r$$

where $M_r$ = rth order moment of the positions of charged amino acids
$X_m$ = mean of all positions of charged amino acids
$X_i$ = position of the ith charged amino acid
$N$ = number of charged amino acids in the sequence

The moments of 2nd to 19th order were used to train the ANN, constituting a total of 20 inputs
together with the frequency of charged amino acids and the length of the protein. The
upper limit of 19th order was set based on assessments of sensitivity and specificity on a
small dataset of adhesins and non-adhesins. Moments of order greater than 19 were not
useful in improvement of performance.
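
The charge inputs can be sketched in Python as follows. The sketch assumes 1-based residue positions, central moments about the mean position divided by the number of charged residues, and no rescaling of the moments; the function names are hypothetical. Returning zeros when a sequence has no charged residue is also an assumption.

```python
CHARGED = set("RKED")


def central_moment(positions, order):
    """r-th order moment of positions about their mean, divided by N."""
    n = len(positions)
    mean = sum(positions) / n
    return sum((x - mean) ** order for x in positions) / n


def charge_inputs(seq, min_order=2, max_order=19):
    """Return [frequency, length, M_2, ..., M_19]: 20 values in total."""
    seq = seq.upper()
    length = len(seq)
    if length == 0:
        raise ValueError("empty sequence")
    positions = [k + 1 for k, aa in enumerate(seq) if aa in CHARGED]
    frequency = len(positions) / length
    n_moments = max_order - min_order + 1
    if positions:
        moments = [central_moment(positions, r)
                   for r in range(min_order, max_order + 1)]
    else:
        moments = [0.0] * n_moments  # assumption for charge-free sequences
    return [frequency, float(length)] + moments


# Toy sequence: charged residues at positions 1 and 5, mean position 3.
inputs = charge_inputs("KAAAE")
```

High-order moments of raw positions grow very quickly with protein length, so a practical implementation would probably rescale positions or moments before training; the source does not describe such a step.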

Hydrophobic composition

A given protein sequence was digitally transformed using the hydrophobic scores of the
amino acids according to Brendel et al. (43). The scores for five groups of amino acids are:
(-8 for K, E, D, R), (-4 for S, T, N, Q), (-2 for P, H), (+1 for A, G, Y, C, W), (+2 for L,
V, I, F, M).

Following inputs were given for each of the groups:
(a) fi = (counts of ith group) / (total counts in the protein); i ranges from 1 to 5
(b) mji = jth order moment of positions of amino acids in ith group; j ranges from 2 to 5.

A total of 25 inputs representing the hydrophobic composition of a protein were fed to
the Neural Network. The rationale for using moments was same as described in the
section on charge composition inputs.
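
The 25 hydrophobic inputs (5 group frequencies plus 4 moments for each of the 5 groups) can be sketched in Python as below. As with the charge inputs, 1-based positions and central moments divided by the group size are assumptions, as are the output ordering and the function name.

```python
# The five groups, listed in the order of their scores -8, -4, -2, +1, +2.
HYDROPHOBIC_GROUPS = ["KEDR", "STNQ", "PH", "AGYCW", "LVIFM"]


def hydrophobic_inputs(seq):
    """Return 5 group frequencies followed by the 2nd to 5th order
    moments of residue positions for each group (25 values)."""
    seq = seq.upper()
    length = len(seq)
    if length == 0:
        raise ValueError("empty sequence")
    frequencies, moments = [], []
    for group in HYDROPHOBIC_GROUPS:
        positions = [k + 1 for k, aa in enumerate(seq) if aa in group]
        frequencies.append(len(positions) / length)
        for order in range(2, 6):
            if positions:
                mean = sum(positions) / len(positions)
                moments.append(
                    sum((x - mean) ** order for x in positions) / len(positions))
            else:
                moments.append(0.0)  # assumption for absent groups
    return frequencies + moments


# Toy sequence: K and E fall in the first group, the three A in the fourth.
x = hydrophobic_inputs("KAAAE")
```

In this layout the first five values are the group frequencies and each following block of four values holds the moments of one group.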

Taken together, a total of 105 compositional properties of a given protein sequence were
used to predict their adhesin characteristics.

The software PropSearch uses 144 compositional properties of protein sequences to
detect possible structural or functional relationships between a new sequence and
sequences in the database (Hobohm and Sander 1995). The approach defines protein
sequence dissimilarity (or distance) as a weighted sum of differences of compositional
properties such as singlet and doublet amino acid composition, molecular weight and
isoelectric point (protein property search or PropSearch). Compositional properties of
proteins have also been used for predicting secretory proteins in bacteria and apicoplast
targeted proteins in Plasmodium falciparum (Zuegge, et al. 2001). The techniques used
there are statistical methods, principal component analysis, self-organizing maps, and
supervised neural networks. In SPAAN, we have used 105 compositional properties in
the five modules, viz. Amino Acid frequencies, Multiplet frequencies, Dipeptide
frequencies, Charge composition and Hydrophobic composition. The total of 105
properties used in SPAAN comprises 20 for Amino acid frequencies, 20 for Multiplet
frequencies, 20 for Dipeptide frequencies (the top 20 significant dipeptides are used, based
on t-test), 20 for Charge composition (frequency of charged amino acids (R, K, E and
D) and moments of 2nd to 19th order), and 25 for Hydrophobic composition. For the
latter, amino acids were classified into five groups: (-8 for K, E, D, R), (-4 for S, T, N, Q),
(-2 for P, H), (+1 for A, G, Y, C, W), (+2 for L, V, I, F, M). The 25 inputs consisted of the
frequency of each group and the moments of positions of amino acids in each
group from 2nd to 5th order.
Neural Network 

A feed forward error back propagation Neural Network was used. The program is a
kind gift from Charles W. Anderson, Department of Computer Science, Colorado State
University, Fort Collins, CO 80523, anderson@cs.colostate.edu
Neural Network architecture

The Neural Network used here has a multi-layer feed-forward topology. It consists of
an input layer, one hidden layer and an output layer. This is a 'fully-connected' Neural
Network where each neuron i is connected to each unit j of the next layer (Figure 1).
The weight of each connection is denoted by W_{ij}. The state I_i of each neuron in the
input layer is assigned directly from the input data, whereas the states of hidden layer
neurons are computed by the sigmoid function,
h_j = \frac{1}{1 + \exp\left(-\left(W_{j0} + \sum_i W_{ij} I_i\right)\right)},
where W_{j0} is the bias weight.
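As a minimal sketch of the hidden layer computation described here, the following Python function evaluates the sigmoid state of each hidden neuron from the input states. It is not the program by Charles W. Anderson; the function name and the list-of-lists weight layout are assumptions made for illustration.

```python
import math


def hidden_states(inputs, weights, biases):
    """Compute h_j = 1 / (1 + exp(-(W_j0 + sum_i W_ij * I_i))) for each hidden neuron j.

    inputs  : list of input states I_i
    weights : weights[j][i] is the (assumed) weight from input i to hidden neuron j
    biases  : biases[j] is the bias weight W_j0
    """
    states = []
    for w_j, b_j in zip(weights, biases):
        net = b_j + sum(w * x for w, x in zip(w_j, inputs))
        states.append(1.0 / (1.0 + math.exp(-net)))
    return states
```

With all inputs and the bias at zero the net input is zero, so each hidden state is 0.5; larger net inputs push the state towards 1.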

The back propagation algorithm was used to minimize the differences between the
computed output and the desired output. Ten thousand cycles (epochs) of iterations were
performed. Subsequently, the best epoch with minimum error was identified. At this
point the network produces approximate target values for a given input in the training
set.

A network was trained optimally for each attribute. Thus five networks were prepared.
The schematic diagram (Figure 1) shows the procedure adopted. The number of
neurons in the input layer was equal to the number of input data points for each
attribute (for example 20 neurons for 20 numerical input vectors of the amino acid
composition attribute). The optimal number of neurons in the hidden layer was
determined through experimentation for minimizing the error at the best epoch for each
network individually. An upper limit for the total number of weight connections was set
to half of the total number of input vectors to avoid over fitting as suggested previously
(Andrea et al).

Computer programs to compute individual compositional attributes were written in C
and executed on a PC under Red Hat Linux ver 7.3 or 8.0. The network was trained on
the training set; the error was checked and the network optimized using the validate set
through back propagation. The validate set was different from the training set. Since the
number of well annotated adhesins was small, we used the 'validate set' itself as the test
set for preliminary evaluation of the performance and to obtain the fraction of
correlation used to compute the weighted average probability (Pad value) described in
the next section. The training set had 367 adhesins and 580 non-adhesins. The validate
set had 102 adhesins and 123 non-adhesins. The adhesins were qualified with the digit
'1' and the non-adhesins were qualified with the digit '0'.

During predictions, the network is fed with new data from sequences that were not
part of the training set. Each network assigns a probability value of being an adhesin to
a given sequence. The final probability is computed as described in the next section.
Probability of being an adhesin, the Pad value
Query proteins are processed modularly through the network trained for each attribute.
Thus, five probability outputs are obtained. The final prediction was computed using the
following expression, which is a weighted linear sum of the probabilities from the five
modules:

P_{ad} = \frac{fc_A P_A + fc_C P_C + fc_D P_D + fc_H P_H + fc_M P_M}{fc_A + fc_C + fc_D + fc_H + fc_M}

where
P_i = probability from module i,
fc_i = fraction of correlation of module i of the trained Neural Network,
i = A (Amino acid frequencies), C (Charge composition), D (Dipeptide
frequencies), H (Hydrophobic composition), or M (Multiplet frequencies).

The fraction of correlation fc_i represents the fraction of total entries that were correctly
predicted (P_{i,adhesin} > 0.5 and P_{i,non-adhesin} < 0.5) by the trained network on the
test set used in preliminary evaluation (Charles Anderson).
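The combination of the five module outputs can be illustrated as follows. This sketch reads the expression as a weighted average (weights fc_i, normalised by their sum), which matches the description in Example 1; the function name is hypothetical and the fc values used in any call would be placeholders, since the trained values are not listed in this section.

```python
MODULES = ("A", "C", "D", "H", "M")


def pad_value(probs, fc):
    """Weighted average of module probabilities.

    probs : dict mapping module letter -> probability P_i from that module
    fc    : dict mapping module letter -> fraction of correlation fc_i (the weight)
    """
    numerator = sum(fc[m] * probs[m] for m in MODULES)
    denominator = sum(fc[m] for m in MODULES)
    return numerator / denominator
```

When all five weights are equal the result reduces to the plain arithmetic mean of the five probabilities; unequal weights pull the Pad value towards the modules that performed better in preliminary evaluation.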
Neural Network 

A feed forward error back propagation Neural Network was used. The program was
downloaded from the web site with permission from the author, Charles W. Anderson,
Department of Computer Science, Colorado State University, Fort Collins, CO 80523,
anderson@cs.colostate.edu
Statistical Analysis 

All statistical procedures were carried out using Microsoft Excel (Microsoft
Corporation Inc. USA).
Sequence analysis

Homology analysis was carried out using CLUSTAL W (Thompson et al 1994),
BLAST (Altschul et al 1990), and CDD (conserved domain database) search
(Marchler-Bauer et al 2002).

The whole genome sequences of microbial pathogens present new opportunities for the
development of clinical applications such as diagnostics and vaccines. The present
invention provides new leads for the development of candidate genes, and their
encoded proteins in their functional relevance to preventive therapeutics.

The protein sequences of both the classes, i.e. adhesin and non-adhesin, were
downloaded from the existing database (National Centre for Biotechnology Information
(NCBI), USA). A total of 105 compositional properties under the five sequence






attributes namely, amino acid composition, multiplet composition, dipeptide
composition, charge composition and hydrophobic composition were computed by
computer programs written in C language. The attributes were computed for all the
proteins in both the databases. The sequence-based attributes were then used to train an
Artificial Neural Network for each of the protein attributes. Adhesins were qualified by
the digit '1' and non-adhesins were qualified by the digit '0'. Finally each trained
Artificial Neural Network was used to identify potential adhesins which can be
envisaged to be useful for the development of preventive therapeutics against
pathogenic infections. Accordingly, the invention provides a computational method for
identifying adhesin and adhesin-like proteins of therapeutic potential, which comprises:

1. preparing two comprehensive data-sets of adhesin and non-adhesin proteins from
publicly available information on protein sequences,

2. calculating computationally the sequence based attributes of the protein sequences in
the publicly available protein datasets using specially developed Software for
Prediction of Adhesins and Adhesin-like proteins using Neural Networks (SPAAN),

3. training the Artificial Neural Network (ANN) for the selected attributes,

4. assigning a probability value suitable for an adhesin, "Pad", to the query protein and
identifying adhesin-like property in the query proteins with the help of the trained
Artificial Neural Network implemented in SPAAN,

5. validating computationally the protein sequences as therapeutic potentials by
comparing with the known protein sequences that are biochemically characterized in
the pathogen genome.

In an embodiment of the invention the protein sequence data may be taken from an
organism, specifically but not limited to organisms such as Escherichia coli,
Haemophilus influenzae, Helicobacter pylori, Mycoplasma pneumoniae,
Mycobacterium tuberculosis, Rickettsia prowazekii, Porphyromonas gingivalis,
Shigella flexneri, Streptococcus mutans, Streptococcus pneumoniae, Neisseria
meningitidis, Streptococcus pyogenes, Treponema pallidum, and Severe Acute
Respiratory Syndrome associated coronavirus.

In another embodiment of the present invention, the different sequence-based attributes
used for identification of proteins of therapeutic potential comprise amino acid
composition, charge composition, hydrophobicity composition, multiplet frequencies,
and dipeptide frequencies.
In an embodiment, the non-homologous adhesin protein sequence may be compared
with that of known sequences of therapeutic applications in the selected pathogens.
In an embodiment of the invention, the sequences of adhesin or adhesin-like proteins
comprise sequences of sequence IDs listed in Tables 5 and 6 identified by the method
of the invention.

In another embodiment of the invention, the computer system comprises a central
processing unit, executing the SPAAN program, giving probabilities based on different
attributes using the Artificial Neural Network and other in-built programs for assessing
attributes, all stored in a memory device accessed by the CPU; a display on which the
central processing unit displays the screens of the above mentioned programs in
response to user inputs; and a user interface device.

In one embodiment of the present invention, the particulars of the organisms such as
their name, strain, accession number in NCBI database and other details are given in
Table 2.

The invention is further explained with the help of the following examples, which are
given by way of illustration and should not be construed to limit the scope of the present
invention in any manner.
Example 1 
Operating SPAAN: 

The purpose of the program is to computationally calculate various sequence-based
attributes of the protein sequences.
The program works as follows:

The FASTA format files downloaded from http://www.ncbi.nlm.nih.gov were saved by
the name <organism_name>.faa. They are converted to the standard format by a C
program and passed as input to another set of C programs which compute the 5
different attributes of protein sequences (a total of 105 compositional properties in all 5
modules).

The computed properties were fed as input to the 5 different Neural Networks. Each
trained network assigns a probability value of being an adhesin for a query protein. The
final probability (Pad) was calculated as a weighted average of these five individual
probabilities. The weights were determined from a correlation value of correct
prediction during test runs of each of the five modules.
Input/Output format:
Downloaded Files and their format:

<organism_name>.faa: file which stores the annotation and the protein sequence.
Input file Format: FASTA
">gi|"<annotation>
For example,

>gi|2314605|gb|AAD08472|histidine and glutamine-rich protein

MAHHEQQQQQQANSQHHHHHHHAHHHHYYGGEHHHHNAQQHA^ 

AQQQQQQQAHQQQQQKAQQQNQQY

>gi|3261822|gnl|PID|e328405 PE_PGRS
MIGDGANGGPGQPGGPGGLLYGNGGHGGAGAAGQDRGAGNSAGLIGNGGAG
GAGGNGGIGGAGAPGGLGGDGGKGGFADEFTGGFAQGGRGGFGGNGNTGAS
GGMGGAGGAGGAGGAGGLLIGDGGAGGAGGIGGAGGVGGGGGAGGTGGGG
VASAFGGGNAFGGRGGDGGDGGDGGTGGAGGARGAGGAGGAGGWLSGHSG
AHGAMGSGGEGGAGGGGGARGEAGAGGGTSTGTNPGKAGAPGTQGDSGDP
GPPG

>gi|. . .
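Reading such a file into (annotation, sequence) pairs can be sketched as below. This is an illustrative Python reader, not the C conversion program mentioned in the text; the function name is hypothetical, and the handling of the literal string ".vertline." (a text rendering of the "|" separator used in NCBI headers) is an assumption added for robustness.

```python
def parse_fasta(text):
    """Return a list of (annotation, sequence) tuples from FASTA-formatted text."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            # Drop the leading '>' and normalise the separator rendering.
            header, chunks = line[1:].replace(".vertline.", "|"), []
        elif header is not None:
            chunks.append(line)  # sequence lines are concatenated per record
    if header is not None:
        records.append((header, "".join(chunks)))
    return records
```

Each record's sequence lines are joined into one string, so the downstream attribute programs can treat every protein as a single residue string.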

Table 1: Output file format given by SPAAN
<organism_name>.out

SN  Pa        Pc        Pd        Ph        Pm        Pad-value  Protein Name
1   0.05683   0.290803  0.441338  0.50304   0.029503  0.260485   >gi|32454344|gb|AAP82966.1|orf1a polyprotein [SARS coronavirus Hong Kong ZY-2003]
2   0.639235  0.166721  0.054583  0.935385  0.453498  0.462452   >gi|32454345|gb|AAP82967.1|orf1ab polyprotein [SARS coronavirus Hong Kong ZY-2003]
3   0.651111  0.911504  0.438696  0.543944  0.924044  0.690247   >gi|32454346|gb|AAP82968.1|spike glycoprotein [SARS coronavirus Hong Kong ZY-2003]
4   0.464324  0.655003  0.179503  0.008700  0.241573  0.300970   >gi|32454347|gb|AAP82969.1|Orf3a [SARS coronavirus Hong Kong ZY-2003]

Where Pa, Pc, Pd, Ph, Pm are the outputs of the five Neural Networks.
Example 2 organisms and sequence numbers 

Table 2: Organism Name, Accession number, Number of base pairs, Date of release
and Total number of proteins analyzed

Organism Name | Accession Number | Number of base pairs | Date of release | Total no. of proteins
E. coli O157:H7 | NC_002695 | 5498450 | 7-Mar-2001 | 5361
H. influenzae Rd | NC_000907 | 1830138 | 30-Sep-1996 | 1709
H. pylori J99 | NC_000921 | 1643831 | 10-Sep-2001 | 1491
M. pneumoniae | NC_000912 | 816394 | 2-Apr-2001 | 689
M. tuberculosis H37Rv | NC_000962 | 4411529 | 7-Sep-2001 | 3927
R. prowazekii strain Madrid E | NC_000963 | 1111523 | 10-Sep-2001 | 835
P. gingivalis W83 | NC_002950 | 2343476 | 9-Sep-2003 | 1909
S. flexneri 2a str. 2457T | NC_004741 | 4599354 | 23-Apr-2003 | 4072
S. mutans UA159 | NC_004350 | 2030921 | 25-Oct-2002 | 1960
S. pneumoniae R6 | NC_003098 | 2038615 | 6-Sep-2001 | 2043
N. meningitidis serogroup A strain Z2491 | NC_003116 | 2184406 | 27-Sep-2001 | 2065
S. pyogenes MGAS8232 | NC_003485 | 1895017 | Jan 31, 2002 | 1845
T. pallidum subsp. pallidum str. Nichols | NC_000919 | 1138011 | 7-Sep-2001 | 1036
Severe Acute Respiratory Syndrome (SARS) associated coronavirus Frankfurt 1 | AY291315 | 29727 | 11-JUN-2003 | 14
SARS coronavirus HSR 1 | AY323977 | 29751 | 15-OCT-2003 | 14
SARS coronavirus ZJ01 | AY297028 | 29715 | 19-MAY-2003 | 3
SARS coronavirus TW1 | AY291451 | 29729 | 14-MAY-2003 | 11
SARS coronavirus CUHK-Su10 | AY282752 | 29736 | 07-MAY-2003 | 4
SARS coronavirus Urbani | AY278741 | 29727 | 12-AUG-2003 | 12
SARS coronavirus | NC_004718 | 29751 | 9-Sep-2003 | 29
SARS coronavirus Tor2 | AY274119 | 29751 | 16-MAY-2003 | 15
SARS coronavirus GD01 | AY278489 | 29757 | 18-AUG-2003 | 12
SARS coronavirus CUHK-W1 | AY278554 | 29736 | 31-JUL-2003 | 11
SARS coronavirus BJ01 | AY278488 | 29725 | 01-MAY-2003 | 11



Example 3 

The multi-layered feed forward Neural Network architecture implemented in SPAAN is
shown in figure 1. A given protein sequence in FASTA format is first processed through
the 5 modules A, C, D, H, and M to quantify the five types of compositional attributes.
A: Amino acid composition, C: Charge composition, D: dipeptide composition of the 20
dipeptides (NG, RE, TN, NT, GT, TT, DE, ER, RR, RK, RI, AT, TS, IV, SG, GS, TG,
GN, VI, HR), H: Hydrophobic composition, M: Amino acid frequencies as Multiplets.
The sequence shown is part of the FimH precursor (gi 5524634) of E. coli.
Subsequently, these numerical data are input to the input neuron layer. The directions
of arrows show data flow. The number of neurons chosen in the input layer was equal
to the number of the numerical input vectors of each module. The network was
optimally trained through minimization of error of detection based on the validate set
through back propagation. The details are described in the methods. Each network
module assigns a probability value of the protein being an adhesin based on the
corresponding attribute. The final probability of a protein sequence being an adhesin is
the Pad value, a weighted average of the individual probabilities and the associated
fraction of correlation, which is a measure of the strength of the prediction.
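As an illustration of module D, the 20 dipeptide inputs could be computed along the following lines. The dipeptide list is taken from the text; the function name is hypothetical, and the normalisation (count divided by the number of overlapping residue pairs in the sequence) is an assumption, since the exact definition of dipeptide frequency is given elsewhere in the methods.

```python
# The 20 dipeptides listed for module D.
DIPEPTIDES = [
    "NG", "RE", "TN", "NT", "GT", "TT", "DE", "ER", "RR", "RK",
    "RI", "AT", "TS", "IV", "SG", "GS", "TG", "GN", "VI", "HR",
]


def dipeptide_inputs(seq):
    """Return 20 frequencies, one per listed dipeptide, in list order.

    ASSUMPTION: frequency = occurrences / (len(seq) - 1), counting overlapping pairs.
    """
    seq = seq.upper()
    total = len(seq) - 1
    if total <= 0:
        return [0.0] * len(DIPEPTIDES)
    counts = {dp: 0 for dp in DIPEPTIDES}
    for i in range(total):
        pair = seq[i:i + 2]
        if pair in counts:
            counts[pair] += 1
    return [counts[dp] / total for dp in DIPEPTIDES]
```

The resulting 20 values correspond to the 20 input neurons of the dipeptide network.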
Example 4 

Performance of SPAAN was assessed using a test set of 37 adhesins and 37 non-adhesins
that were not part of the training set. Matthews correlation coefficient (Mcc, plotted on
the Y-axis) was computed for all the proteins with Pad values above a given threshold
(plotted on the X-axis) (figure 2). The Matthews correlation is defined as:

Mcc = \frac{(TP \cdot TN) - (FP \cdot FN)}{\sqrt{(TN + FN)(TN + FP)(TP + FN)(TP + FP)}}

where TP = True Positives, TN = True Negatives, FP = False Positives, FN = False
Negatives.

Here TPs are adhesins and TNs are non-adhesins. In general, adhesins have high Pad
value, whereas non-adhesins have low Pad value. Thus known adhesins with Pad value
above a given threshold are true positives whereas known non-adhesins with Pad value
below the given threshold are true negatives. The sensitivity, Sn, is given by

Sn = \frac{TP}{TP + FN}

and the specificity, Sp, is given by

Sp = \frac{TP}{TP + FP}

False negatives are those cases wherein a known adhesin had Pad value lower than the
chosen threshold. Similarly, a known non-adhesin with a Pad value higher than the
chosen threshold was taken as false positive. A theoretical polynomial curve of second
order (dashed line) was fitted to the observed curve (smooth line) with a Karl-Pearson
correlation coefficient = 0.9799. The maximum point of the theoretical curve (where
the first derivative vanishes and the second derivative is negative) was chosen as
reference (vertical dotted line) to identify the maximum Mcc = 0.94 on the observed
curve (shown by arrow). The corresponding Pad value threshold was 0.51. At this Pad
value threshold, Sn and Sp were 0.89 and 1.0 respectively. Note that the Mcc does not
drop down to the x-axis because the highest Pad value attained by adhesins was 0.939
in comparison to the theoretical attainable limit of 1.0.
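The three performance measures can be computed from the four counts as sketched below. The function names are hypothetical, and Sp is taken as TP / (TP + FP), which is how this section appears to define it (elsewhere this quantity is often called precision). The counts in the usage note are hypothetical values chosen only to be consistent with 37 test adhesins and the reported Sn of 0.89.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from the four confusion counts."""
    denominator = math.sqrt((tn + fn) * (tn + fp) * (tp + fn) * (tp + fp))
    if denominator == 0:
        return 0.0  # convention used here when any marginal total is zero
    return (tp * tn - fp * fn) / denominator


def sensitivity(tp, fn):
    """Sn = TP / (TP + FN)."""
    return tp / (tp + fn)


def specificity(tp, fp):
    """Sp = TP / (TP + FP), following the definition assumed from this section."""
    return tp / (tp + fp)
```

For example, with the hypothetical counts TP = 33 and FN = 4, `sensitivity(33, 4)` rounds to 0.89, and with no false positives `specificity(33, 0)` is 1.0.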
Example 5 

Assessment of SPAAN on well known adhesins from various bacterial pathogens.
Table 3. Prediction of well characterized adhesins from various bacterial pathogens
using SPAAN.

Species | Disease caused | Adhesin (a) | Host ligand | Pad value (b) (Range)
E. coli | Diarrhoea | PapG (27) | α-D-Gal(1-4)β-D-Gal-containing receptors | 0.84-0.76
 | | SfaS (5) | alpha-sialyl-beta-2,3-b-galactose | 0.94-0.94
 | | FimH (63) | D-mannosides | 0.96-0.23 (c)
 | | Intimin (12) | tyrosine-phosphorylated form of host cell receptor Hp90 | 0.95-0.78
 | | PrsG (5) | Gal(alpha1-4)Gal | 0.86-0.85
Nontypeable H. influenzae | Influenza | HMW1, HMW2 | Human epithelial cells | 0.97
 | | Hia (8) | human conjunctival cells | 0.93-0.90
H. influenzae | bacterial meningitis (d) | HifE (18) | Sialylganglioside-GM1 | 0.85-0.73
K. pneumoniae | Pneumonia | MrkD | type V collagen | 0.8?, 0.85
B. pertussis | Whooping cough | FHA | Sulphated sugars on cell-surface glycoconjugates |
 | | Pertactin | Integrins | 0.43
Y. enterocolitica | Enterocolitis | YadA (5) | β1 integrins | 0.88-0.79
S. mutans | Dental Caries | SpaP (2) | Salivary glycoprotein | 0.88, 0.87
 | | PAc | Salivary glycoprotein | 0.88
Streptococcus gordonii | Oral cavity | SspA (2) | Salivary glycoprotein | 0.85, 0.84
 | | CshA | Fibronectin | 0.78
 | | CshB | Fibronectin | 0.63
 | | ScaA | Co-aggregation | 0.71
 | | SspB (2) | Salivary glycoprotein | 0.85, 0.84
Streptococcus sobrinus | Tooth decay | SpaA | Salivary glycoprotein | 0.89
 | | PAg (2) | Salivary glycoprotein | 0.89, 0.73
Streptococcus pyogenes | Scarlet Fever | Protein F | Fibronectin | 0.49
Streptococcus pneumoniae | Bacterial Pneumonia | PsaA (5) | Human nasopharyngeal cells | 0.82-0.78
 | | CbpA (e) / SpsA / PbcA / PspC | phosphorylcholine of the teichoic acid | 0.81-0.49
Streptococcus parasanguis | Valve endocarditis | FimA | Salivary glycoprotein, fibrin | 0.76
Streptococcus sanguis | Tooth Decay | SsaB | Salivary glycoprotein | 0.71
Enterococcus faecalis | Empyema in patients with liver disease | EfaA | Unknown | 0.83
Staphylococcus aureus | Food Poisoning | FnbA | Fibronectin | 0.8
 | | FnbB (3) | Fibronectin | 0.78, 0.77, 0.69
Helicobacter pylori | Peptic Ulcers | BabA (17) | difucosylated Lewis b blood group antigen | 0.87-0.68

(a): The number of sequences from different strains and homologs from related species
analyzed are shown in parentheses.
(b): Rounded off to the second decimal.
(c): Out of 63 FimH proteins, 54 were from E. coli, 6 from Shigella flexneri, 2 from
Salmonella enterica and 1 was from Salmonella typhimurium. Except 2 FimH proteins,
the rest had Pad ≥ 0.51. The 2 exceptions (gi numbers: 5524636, 1778448) were from E.
coli. The gi:5524636 protein is annotated as a FimH precursor but is much shorter (129
amino acids) than other members of the family. The gi:1778448 protein is a S.
typhimurium homolog in E. coli.

(d): Other ailments include pneumonia, epiglottitis, osteomyelitis, septic arthritis and
sepsis in infants and older children.

(e): The adhesin CbpA is also known by alternative names SpsA, PbcA and PspC. A total
of seven sequences were analyzed. Except 1 PspC sequence, the rest all had Pad ≥ 0.51.
Example 6 

Ability of SPAAN to discriminate adhesins from non-adhesins at Pad ≥ 0.51 (figure
3a).

Example 7

The non-homology character of SPAAN assessed in both adhesins and non-adhesins
(figure 3b and 3c).

Figure 3 (a - c). SPAAN is non-homology based software. A total of 130 adhesins and
130 non-adhesins were analyzed to assess whether the predictive power of SPAAN
could be influenced by the sequence relationships. (a) Histogram plots of the number of
proteins in the various Pad value ranges are shown. Shaded bars represent adhesins
whereas open bars represent non-adhesins. Note SPAAN's ability to segregate
adhesins and non-adhesins into two distinct cohesive groups. (b) Pairwise sequence
relationships among the adhesins were determined using CLUSTAL W and plotted on
the X-axis. Higher scores indicate similar pairs. The corresponding difference in Pad
values (ΔPad) in the same protein pair was plotted on the Y-axis. Each point in the
diagram represents a pair. The arrow points to protein pairs of the FimH family with
high ΔPad values in spite of high similarity. Since one of the FimH proteins (gi:
5524636) had a very low Pad value, all pairs with this false negative protein show high
ΔPad values. The protein (gi: 5524636) is of much shorter length compared with other
members of the same family. (c) Similar plot for non-adhesins. Data are plotted in the 4
quadrant format for clear inspection. Note that among protein pairs with CLUSTAL W
score < 20 the majority (82% in adhesins and 86% in non-adhesins) have ΔPad < 0.2.
These data support the non-homology character of SPAAN.
Example 8

Genome scan of pathogens by SPAAN identifies well known adhesins and new
adhesins and adhesin-like proteins

Table 4. Analysis of predictions made by SPAAN on genome scans of a few selected
pathogenic organisms^



Species 

Protein 
Class 


Escherichia coli
O157:H7


Mycobacterium
tuberculosis H37Rv


SARS associated
coronavirus (11
strains)


Total number of
proteins with Pad ≥ 0.51


575 


43 D 




Known adhesins


X / 






Putative proteins with
adhesin-like
characteristics


92" 


105 




Hypothetical proteins
with adhesin-like
characteristics








Proteins likely to be 
extracytoplasmic or 
located at surface 




191"^ 


5" 


Phage proteins 


30' 






Others 


13^ 


6* 




Hypothetical proteins 


157" 


86" 




Wrong predictions 


54' 


47' 





^: SPAAN has general applicability. The three pathogens chosen here are those in
which intense investigations are being conducted presently. M. tuberculosis is of
special importance to developing countries.
^: Fimbrial adhesins, AidA-I, gamma intimin, curlin, translocated intimin receptor,
putative adhesin and transport, Iha, prepilin peptidase dependent protein C.

^: These proteins have been annotated as proteins with a putative function. These
sequences were analyzed using CDD (Conserved domain database, NCBI) and BLAST
searches. Adhesin-like domains were found in these proteins.
^: These proteins have been annotated as 'hypothetical'. These sequences were
analyzed using CDD and BLAST searches. Adhesin-like domains were found in these
proteins.
^: These proteins are outer membrane, extracellular, transport, surface, exported,
flagellar, periplasmic lipoprotein, and proteins annotated as 'hypothetical' but found to
have similar functions listed here using BLAST and CDD searches.
^: The phage proteins were of the following functional roles - tail fiber, head
decoration, DNA injection, tail, major capsid, host specificity, endolysin.

^: Proteins predicted by SPAAN but not readily classifiable into the classes listed here
have been collectively grouped as 'Others'. However, some of these proteins are known
to participate in host-pathogen interactions. The annotated functional roles are type III
secretion, antibiotic resistance, heat shock, acid shock, structural, tellurium resistance,
terminase, Hcp-like, Sec-independent translocase, uncharacterized nucleoprotein,
HicB-like.

^: These proteins have been annotated as hypothetical. Re-analyses of these proteins
using BLAST and CDD failed to identify any function for these proteins.
These proteins have been annotated with functional roles that are very likely to occur
within the cell. Hence these proteins may have only a remote possibility of functioning
as adhesins or adhesin-like proteins. Therefore this set of proteins has been incorrectly
predicted as adhesins or adhesin-like by SPAAN.

These proteins are PE_PGRS, PE proteins. Several reports (for example Brennan et
al) indicate that PE_PGRS proteins may be localized to the cell surface and aid in
host-pathogen interaction.

^: Lipoproteins (lpp, lpq, lpr), PPE, outer membrane, surface, transport, secreted,
periplasmic, extracellular, ESAT-6, peptidoglycan binding, exported, mpt (with
extracellular domains), and proteins annotated as 'hypothetical' but found to have
similar functions listed here using BLAST and CDD searches.

^: These proteins were of the following functions - glutaredoxin-like thioltransferase,
putative involvement in molybdate uptake, ATP synthase chain, sulphotransferases,
S.erythraea rhodanese-like protein M29612|SERCYSA_5, unknown function.

These proteins were the spike glycoprotein with antigenic properties, and nsp2, nsp5, 
nsp6 and nsp7. 
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Table 5: New putative adhesins predicted by SPAAN in the genomes listed in table 
2- 

(Total number = 279) 

Protein GI Gene ID Protein name 

Number 

Escherichia coli 0157:H7 

13360742 912619 hemagglutinin/hemolysin-related protein 

putative ATP-binding component of a transport system 
putative tail fiber protein 

minor fimbrial subxmif D-mannose specific adhesin 
putative fimbrial-like protein 
AidA-I adhesin-like protein 
putative jSmbrial protein 
putative invasin 
putative invasin 
Gamma intimin 

putative DN A transfer protein precursor 
putative fimbrial protein 
AidA-I adhesin-like protein 
putative fimbrial-like protein 
putative fimbrial-like protein 

putative ATP-binding component of a transport system 
putative flagellin structural protein 
putative type 1 fimbrial protein precursor 
curlin major subunit CsgA 
translocated intimin receptor Tir 
putative major pilin protein 

putative ATP-binding component of a transport system and 
adhesin protein 

export and assembly outer membrane protein of type 1 
fimbriae 

homolog of Salmonella FimH protein 



13362986 

13361114 

13364757 

13362687 

13360856 

13364140 

13359793 

13364768 

13364034 

13362703 

13364141 

13359819 

13360480 

13362692 

13362585 

13359881 

13361579 

13360880 

13364036 

13360740 

13361582 



914770 

913228 

913676 

915687 

912599 

915374 

914435 

913650 

915471 

915668 

915376 

914463 

917768 

915681 

916824 

914526 

917311 

913991 

915465 

912615 

917317 



13364754 913683 



13360484 917767 
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13364751 913688 major type 1 subimit fimbrin 

1 33 59597 9 1 3742 putative fimbrial protein 

13362550 916787 putative ATP-binding component of a transport system 

13359595 913739 putative fimbrial protein 

13359599 913748 probable outer membrane porin protein involved in fimbrial 

assembly 

1 3363900 9 ) 5704 putative fimbrial protein precursor 

13361575 917307 putative fimbrial-like protein 
13364756 913678 fimbrial morphology 

133 63496 9 1 6 1 42 truncated putative fimbrial protein 

13359601 913761 putative fimbrial-like protein 

133 64 145 915368 putative type 1 fimbrial protein 

13363902 915708 putative outer membrane usher protein precursor 

13361576 917309 putative outer membrane protein 
13361013 913353 putative major tail subunit 
13364755 913682 fimbrial morphology 

13360738 912793 putative outer membrane usher protein 
13363928 915608 alpha-amylase 

13363495 916144 putative outer membrane protein 

13362383 916617 putative type-1 fimbrial protein 

13364373 914972 outer membrane vitamin B12 receptor protein BtuB 

13360879 912479 minor curlin subunit precursor CsgB 

13360739 912756 putative chaperone protein 
13361574 917314 putative fimbrial-like protein 
13361 127 913212 outer membrane protease precursor 
13363210 916442 putative lipoprotein 

13361104 913238 major tail protein 

13361709 917446 putative major tail subunit 

13359725 914366 outer membrane pore protein PhoE 

1 3360875 91 3765 curli production assembly/transport component CsgF 

13362170 913927 putative outer membrane protein 

13361473 917203 putative BigB-like protein 
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13364025 915286 EspF protein 

13360081 916982 outer membrane receptor for ferric enterobactin (enterochelin) 

and colicins B and D 

13362977 914779 hypothetical lipoprotein 

13360351 917632 outer membrane protein X 

13360696 914208 putative outer menibrane precursor 

] 3361456 91 7206 putative outer menrbrane protein 

13361 626 9 1 7374 putative outer host membrane protein precursor 

13361698 917449 putative outer membrane protein 

133 62 186 913421 putative outer membrane protein precursor 

13362697 915676 long-chain fatty acid transport protein FadL 

13360918 914188 flagellar hook protein FlgE 

13360737 912506 putative outer membrane protein 

13360342 917629 putative outer membrane receptor for iron transport 

13363396 916248 outer membrane channel TolC 

13361958 912705 putative scaffolding protein in the formation of a murein- 

synthesizing holoenzyme 

13359921 914566 nucleoside-specific channel-forming protein TSX 

13360944 913890 outer membrane receptor for ferric iron uptake 

13359998 914644 putative outer membrane transport protein 

13363390 91625 1 putative ferrichrome iron rec^tor precursor 

1 3364227 915153 outer membrane pliospholipase A 

1 336 1 982 91 2846 putative outer membrane protein 

13360129 917032 a minor lipoprotein 

13361817 912692 putative outer membrane protein 

13360233 917507 membrane spanning protein TolA 

1 3 362837 9 1 52 1 8 putative outer membrane lipoprotein 

13362328 912985 putative colanic acid biosynthesis glycosyl transferase 
Haemophilus influenzae Rd 

16272254 949521 prepilin peptidase-dependent protein D 
16272928 950762 immunoglobin A1 protease 
16272129 951072 lipoprotein 
16273251 950616 hemoglobin-binding protein 
30995429 950130 opacity protein 
16272854 949634 protective surface antigen D15 
16272283 950648 opacity associated protein 
16272604 949701 hemoglobin-binding protein 


Helicobacter pylori J99 




415'^101 889157 putative vacuolating cytotoxin (VacA) paralog 
4154798 890022 putative vacuolating cytotoxin (VacA) paralog 
4155426 890036 putative vacuolating cytotoxin (VacA) paralog 
4155390 890075 vacuolating cytotoxin 
4155400 890058 outer membrane protein - adhesin 
4155681 889718 putative Outer membrane protein 
4155420 890042 Outer membrane protein/porin 
4155775 889799 outer membrane protein - adhesin 
4155419 890044 Outer membrane protein/porin 
4154526 889066 putative Outer membrane protein 
4154724 889419 putative Outer membrane protein 
4155862 890404 putative Outer membrane protein 
4156048 889958 putative IRON(III) DICITRATE TRANSPORT PROTEIN 
4154510 889297 putative Outer membrane protein 
4155432 889515 putative outer membrane protein 
4155623 889671 putative Outer membrane protein 
4155700 889739 putative Outer membrane function 
4154740 889426 Outer membrane protein/porin 
4155692 889743 putative Outer membrane protein 
4155594 889648 putative outer membrane protein 
4155680 889719 putative Outer membrane protein 
4155217 890243 putative Outer membrane protein 
4155958 889905 putative Outer membrane protein 
4155201 890259 putative Outer membrane protein 
4155013 889232 cag island protein 
4154974 889032 putative Outer membrane protein 
4155214 890244 putative Outer membrane protein 
4154973 889042 Outer membrane protein 
4155344 890115 putative Outer membrane protein 
4155099 889160 FLAGELLIN A 
4155023 888978 cag island protein 
4155035 889201 cag island protein, CYTOTOXICITY ASSOCIATED IMMUNODOMINANT ANTIGEN 
4155289 890164 NEURAMINYLLACTOSE-BINDING HEMAGGLUTININ PRECURSOR 

Mycoplasma pneumoniae 

13507881 877207 involved in cytadherence 
13507880 877268 ADP1_MYCPN adhesin P1 
13508228 877211 species specific lipoprotein 
13508181 877124 species specific lipoprotein 
13508179 877071 Mollicute specific lipoprotein, MG307 homolog, from M. genitalium 
13508178 877118 Mollicute specific lipoprotein, MG307 homolog, from M. genitalium 
13508176 876797 Mollicute specific lipoprotein, MG307 homolog, from M. genitalium 
13508175 876848 Mollicute specific lipoprotein, MG307 homolog, from M. genitalium 
13508106 876953 involved in cytadherence 
13508350 877112 similar to phosphate binding protein PstS 

Mycobacterium tuberculosis H37 Rv 

15607496 886491 PPE 
15607445 886592 PPE 
15610644 888270 PE_PGRS 
15608588 886605 PE_PGRS 
15609627 887941 PE_PGRS 
15610643 888256 PE_PGRS 
15607718 887725 PE_PGRS 
15609054 885362 PPE 
15610486 888113 PPE 
15610483 888120 PPE 
15610479 888033 PPE 
15609771 888573 PE_PGRS 
15610648 888306 PE_PGRS 
15610481 888114 PE_PGRS 
15608117 885264 PE_PGRS 
15607973 885391 PE_PGRS 
15608231 885258 PE_PGRS 
15608906 885429 PE_PGRS 
15608891 885544 PPE 
15609990 888171 PE_PGRS 
15609055 885506 PPE 
15608227 887094 PE_PGRS 
15610524 888151 PE_PGRS 
15609490 886003 PPE 
15607886 888664 PE_PGRS 
15609624 887909 PE_PGRS 
15607420 886621 PE_PGRS 
15608897 885325 PE_PGRS(wag22) 
15608590 886595 PE_PGRS 
15609728 887992 PE_PGRS 
15608012 885742 PE_PGRS 
15608534 886745 PE_PGRS 
15608940 885730 PE_PGRS 


1 ^/^HTSST 
1 jOU /oo / 


oooOOZ 




15609235 888312 PE_PGRS 
15610694 887822 PPE 
15609533 885517 PE_PGRS 
15610480 PE_PGRS 

Rickettsia prowazekii strain Madrid E 
15604316 883411 CELL SURFACE ANTIGEN (sca3) 
15604546 883694 CELL SURFACE ANTIGEN (sca5) 


Porphyromonas gingivalis W83 


34541453 2551934 hemagglutinin protein HagA 
34540040 2551409 lipoprotein, putative 
34540364 2552375 extracellular protease, putative 
34541613 2552074 hemagglutinin protein HagE 
34540183 2551891 internalin-related protein 


Shigella flexneri 2a str. 2457T 


30065424 1080663 minor fimbrial subunit, D-mannose specific adhesin 
30062726 1077662 putative adhesion and penetration protein 
30063758 1078834 putative fimbrial-like protein 
30065431 1080671 major type 1 subunit fimbrin (pilin) 
30063366 1078379 flagellar protein FliD 
30064308 1079668 outer membrane fluffing protein 
30062613 1077555 flagellar hook protein FlgE 
30061954 1076843 conserved hypothetical lipoprotein 
30065173 1080393 putative lipase 
30065425 1080664 minor fimbrial subunit, precursor polypeptide 
30064485 1079637 putative fimbrial protein 
30062615 1077558 flagellar basal body L-ring protein FlgH 
30064307 1079452 outer membrane fluffing protein 
30065601 1080859 putative glycoprotein/receptor 
30062118 1077025 putative fimbrial-like protein 
30064099 1079223 lipoprotein 
30062616 1077559 flagellar basal body P-ring protein FlgI 
30063546 1078596 putative fimbrial-like protein 
30062940 1077910 putative outer membrane protein 
30065426 1080665 minor fimbrial subunit, precursor polypeptide 
30062779 1077721 putative outer membrane protein 
30064194 1079329 putative lipoprotein 
30063365 1078378 flagellin 
30062298 1077222 outer membrane protein X 
30064968 1080175 putative major fimbrial subunit 
30061858 1076740 outer membrane pore protein E (E,Ic,NmpAB) 
30062178 1080410 minor lipoprotein 
30062479 1077412 putative fimbrial-like protein 
30062565 1077506 minor curlin subunit precursor 
30063880 1078972 putative outer membrane lipoprotein 
30064531 1079686 cytoplasmic membrane protein 
30065033 1080243 putative receptor protein 
Streptococcus mutans UA159 
24378550 1029610 putative secreted antigen GbpB/SagA; putative peptidoglycan hydrolase 
24379087 1028055 cell surface antigen SpaP 
24380463 1029310 putative membrane protein 
24379075 1028046 penicillin-binding protein 2b 
24378955 1027967 penicillin-binding protein 1a; membrane carboxypeptidase 
24379801 1028662 glucan-binding protein C, GbpC 
24379528 1029536 hypothetical protein; possible cell wall protein, WapE 
24379231 1028158 putative glucan-binding protein D; BglB-like protein 
24380488 1029325 conserved hypothetical protein; possible transmembrane protein 
24380291 1029139 putative amino acid binding protein 
24379342 1028247 putative penicillin-binding protein, class C; fmt-like protein 
24380047 1028904 putative ABC transporter, branched chain amino acid-binding protein 
24378698 1029755 putative ABC transporter, metal binding lipoprotein; surface adhesin precursor; saliva-binding protein; lipoprotein receptor LraI (LraI family) 
24378708 1029768 putative transfer protein 
24379427 1028331 cell wall-associated protein precursor WapA 
24379272 1028196 putative amino acid transporter, amino acid-binding protein 
24379641 1028511 putative ABC transporter, amino acid binding protein 
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Streptococcus pneumoniae R6 



15902395 934801 Choline-binding protein 
15902381 934810 Choline-binding protein F 
15902165 932894 Surface protein pspA precursor 
15904047 934859 Choline binding protein D 
15904036 933487 Choline binding protein A 
15903986 933069 Choline-binding protein 
15903796 933669 Autolysin (N-acetylmuramoyl-L-alanine amidase) 
Neisseria meningitidis Z2491 
15794121 907145 putative membrane protein 
15794144 907168 putative surface fibril protein 
15793284 906275 truncated pilin 
15793460 906456 IgA-specific serine endopeptidase 
15793282 906273 fimbrial protein precursor (pilin) 
15793337 906332 adhesin 
15793253 906243 putative lipoprotein 
15794356 907848 putative lipoprotein 
15793684 906699 putative membrane protein 
15793290 906281 truncated pilin 
15793283 906274 truncated pilin 
15793475 906471 haemoglobin-haptoglobin-utilization protein 
15793406 906401 porin, major outer membrane protein P.I 
15794985 907333 adhesin MafA2 
15794344 907836 putative lipoprotein 
15794622 908118 hypothetical outer membrane protein 
15793599 906604 pilus-associated protein 
15793763 906779 putative periplasmic binding protein 
Streptococcus pyogenes MGAS8232 
19745214 995235 putative secreted protein 
19746570 994224 putative penicillin-binding protein 1a 
19745593 994771 putative 42 kDa protein 
19745813 993958 putative adhesion protein 
19745225 994839 putative choline binding protein 
19745828 995250 streptolysin S associated protein 
19746229 995021 putative minor tail protein 
19746909 994105 putative laminin adhesion 
19745560 995061 putative cell envelope proteinase 
Treponema pallidum subsp. pallidum str. Nichols 
15639714 2611034 flagellar hook protein (flgE) 
15639609 2611657 tpr protein J (tprJ) 
15639111 2610909 tpr protein C (tprC) 
15639125 2610968 tpr protein D (tprD) 



SARS coronavirus 

31581505 spike protein S [SARS coronavirus Frankfurt 1] 
32187357 spike protein S [SARS coronavirus HSR 1] 
32187342 spike glycoprotein [SARS coronavirus ZJ01] 
30698329 putative spike glycoprotein S [SARS coronavirus TW1] 
30421454 putative spike glycoprotein [SARS coronavirus CUHK-Su10] 
30027620 S protein [SARS coronavirus Urbani] 
29836496 1489668 E2 glycoprotein precursor; putative spike glycoprotein [SARS coronavirus] 
30795145 spike glycoprotein [SARS coronavirus Tor2] 
31416295 spike glycoprotein S [SARS coronavirus GD01] 
30023954 putative E2 glycoprotein precursor [SARS coronavirus CUHK-W1] 
30275669 spike glycoprotein S [SARS coronavirus BJ01] 
29837498 3C-like proteinase nsp5-pp1a/pp1ab (3CL-PRO) [SARS coronavirus] 
29837501 putative nsp8-pp1a/pp1ab [SARS coronavirus] 
29837503 putative nsp10-pp1a/pp1ab; formerly known as growth-factor-like protein [SARS coronavirus] 
29837502 putative nsp9-pp1a/pp1ab [SARS coronavirus] 

Table 6: Hypothetical proteins predicted as putative adhesins by SPAAN in the 
genomes listed in Table 2 
(Total number of proteins = 105) 

Protein GI number   Gene ID 

Escherichia coli O157:H7 
13363955 915578 
13360000 914929 
13362244 912369 
13359999 914888 
13361583 917316 
13361172 913156 
13361131 913207 
13359780 914422 
13360571 912499 
13362197 912893 
13362260 912399 
13360947 913505 
13361464 917196 
13361635 917367 
13362421 916655 
13361463 917195 
Haemophilus influenzae Rd 
16272115 951058 
30995442 950581 
Helicobacter pylori J99 
4155526 889586 
4155712 889748 
4155632 889684 
4156035 889468 
4155499 
Mycoplasma pneumoniae 

13507870 877230 
13508239 877245 
13508109 876868 
13508025 877084 
13507838 876784 
13507883 877183 

13507871 877239 
13507944 877056 
13508241 876750 
13507942 877055 
13507840 877387 
13507867 877242 
13508201 877044 
13507941 876985 
13508114 877397 
Mycobacterium tuberculosis H37Rv 
15611014 886198 
15610173 887320 
15609513 885515 
15608094 885411 
15610958 886155 
15607528 886436 
15607678 887473 
15609587 885760 
15610708 887227 
15609526 885246 
15611033 886225 
15609028 885094 
15607730 887771 

15609121 885813 

15608255 885951 




15608409 887039 
15609124 885815 
15607734 887797 
Rickettsia prowazekii strain Madrid E 
15604649 883964 
15604322 883472 
15604659 883996 
15604417 883217 
Porphyromonas gingivalis W83 
34540233 2551594 
Shigella flexneri 2a str. 2457T 
30062687 1077638 
30062956 1080449 
30063681 1078754 
30065435 1080675 
30063891 1078983 
30063211 1078195 
30065233 1080463 
30064387 1079531 
30062638 1077590 
30065236 1080466 
30061839 1076721 
Streptococcus mutans UA159 
24378864 1029452 
24380475 1029319 
24380237 1029088 
24379203 1028139 
24380480 1029320 
24379275 1029489 
24379291 1028216 
24379295 1028215 
24379804 1028663 
24379162 1029417 
24378987 1029363 
24379179 1028118 
24379166 1028107 
24378827 1029444 
24380216 1029067 
Streptococcus pneumoniae R6 
15902140 932867 
15903446 934616 
15903916 934001 
15903848 933609 
15902832 934332 
15902372 934804 
15902152 932889 
Neisseria meningitidis Z2491 
15793668 906680 
15794714 907603 
Streptococcus pyogenes MGAS8232 

19747011 993608 
19747024 994165 

19747012 994373 
19746396 995057 
19746651 993824 
19745883 995045 
19745912 994077 

Treponema pallidum subsp. pallidum str. Nichols 
15639844 2611061 
15639720 2611059 

Table 7: The list of 198 adhesins found in bacteria 
PapG (E. coli) 

12837502 
7407201 
7407207 

7407205 

147096 

4240529 

7407203 

42308 

7443327 

78746 

18265934 

26111419 

26250987 

26109826 

26249418 

13506767 

42301 

78745 

129622 

147092 

13506906 

7407209 

147080 

281926 

7407199 

147100 

78744 

SfaS (E. coli) 

477910 

264035 

42959 

134449 

96425 



FimH (E. coli) 
26251208 

26111640 

5524634 

29422425 

5524630 

29422435 

29422415 

10946257 

29422419 

11120564 

29422457 

11120562 

29422459 

5524632 

29422455 

29422453 

29422451 

29422449 

29422447 

29422445 

29422443 

29422437 

29422433 

29422431 

29422429 

29422427 

29422423 

29422421 

29422417 

729494 

1361011 

1790775 
3599571 

29422441 

12620398 

29422439 

5524628 

1787779 

1742472 

1742463 

15801636 

25321294 

12515169 

11120566 

24051859 

24112911 

13360484 

15800801 

15830279 

25392018 

25500156 

12514120 

1787173 

16128908 

16501811 

16759519 

24051219 

24112354 

30040724 

30062478 

6650093 

5524636 

1778448 

Intimin (E.coli) 



WO 2005/076010 PCT/IN2005/000037 



17384659 

4388530 

1389879 

15723931 

4323336 

4323338 

4323340 

4323342 

4323344 

4323346 

4323348 

4689314 

PrsG (E.coli) 

42523 

42529 

7443328 

7443329 

1172645 

HMW1 (Nontypeable H. influenzae) 

282097 

HMW2 (Nontypeable H. influenzae) 

5929966 

Hia (Nontypeable H. influenzae) 

25359682 
25359489 
25359709 
25359628 
25359414 
25359389 
21536216 
25359445 

HifE (H. influenzae) 
MrkD (K. pneumoniae) 
FHA (B. pertussis) 
Pertactin (B. pertussis) 
YadA (Y. enterocolitica) 

SpaP (S. mutans) 



13506868 

13506870 

13506872 

13506874 

13506876 

3688787 

3688790 

3688793 

2126301 

1170264 

1170265 

533127 

535169 

3025668 

3025670 

3025672 

3025674 

642038 

127307 

17154501 

33571840 

10955604 

4324391 

28372996 

23630568 

32470319 



26007028 
PAc (S. mutans) 

SspA (Streptococcus gordonii) 



CshA (Streptococcus gordonii) 
CshB (Streptococcus gordonii) 
ScaA (Streptococcus gordonii) 
SspB (Streptococcus gordonii) 



SpaA (Streptococcus sobrinus) 
PAg (Streptococcus sobrinus) 



Protein F (Streptococcus pyogenes) 
PsaA (Streptococcus pneumoniae) 



CbpA / SpsA / PbcA / PspC 
(Streptococcus pneumoniae) 
47267 

129552 

25990270 
1100971 

457707 

18389220 

310633 

25055226 
3220006 

546643 

217036 
47561 

19224134 

18252614 

7920456 

7920458 

7920460 

7920462 



14718654 
2425109 
FimA (Streptococcus parasanguis) 
SsaB (Streptococcus sanguis) 
EfaA (Enterococcus faecalis) 
FnbA (Staphylococcus aureus) 
FnbB (Staphylococcus aureus) 



BabA (Helicobacter pylori) 
2576331 
2576333 
3153898 
9845483 
19548141 

97883 

97882 

493017 

120457 

581562 

21205592 

13702452 

13309962 
13309964 
13309966 
13309968 
13309970 
13309972 
13309974 
13309976 
13309978 
13309980 
13309982 
13309984 
13309986 
13309988 
13309990 
13309992 
13309994 

Advantages: 

1. The method helps in discovering putative adhesins, which are of great 
importance in drug discovery and preventive therapeutics. 

2. The method is useful in predicting the adhesive nature of even unique proteins, 
because it is independent of the homology of the query proteins with other 
proteins. 

3. The method is easy to use. Only the amino acid sequence is required as input 
for calculating the output; no other information is needed to assess the 
adhesive nature of a protein. 
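
Because only the amino acid sequence is needed as input, the simpler compositional attributes can be computed directly from a sequence string. The following is a minimal sketch, not the SPAAN code itself: the function names are hypothetical, and it assumes that frequencies are expressed as fractions (of standard residues for single amino acids, of overlapping residue pairs for dipeptides), which may differ from the normalization the actual attribute modules use.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def amino_acid_frequencies(sequence):
    """Fraction of each standard residue among the standard residues present."""
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return {aa: counts[aa] / total for aa in AMINO_ACIDS}


def dipeptide_frequencies(sequence, dipeptides):
    """Fraction of overlapping residue pairs that match each listed dipeptide."""
    seq = sequence.upper()
    n_pairs = len(seq) - 1
    if n_pairs < 1:
        raise ValueError("sequence must contain at least two residues")
    counts = Counter(seq[i:i + 2] for i in range(n_pairs))
    return {dp: counts[dp] / n_pairs for dp in dipeptides}


if __name__ == "__main__":
    toy = "MKNGTTNGRE"  # made-up sequence, for illustration only
    print(amino_acid_frequencies(toy)["T"])
    print(dipeptide_frequencies(toy, ["NG", "TT", "RE"]))
```

Each function returns a plain dictionary, so the values can be placed in a fixed order to form the numerical input vector for the corresponding network.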
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Claims 

1. A computational method for identifying adhesin and adhesin-like proteins, said 
method comprising steps of: 

a. computing the sequence-based attributes of protein sequences using five 
attribute modules of a neural network software, wherein the attributes 
are (i) amino acid frequencies, (ii) multiplet frequency, (iii) dipeptide 
frequencies, (iv) charge composition, and (v) hydrophobic composition, 

b. training an artificial neural network (ANN) for each of the computed five 
attributes, and 

c. identifying the adhesin and adhesin-like proteins having probability of 
being an adhesin (Pad) as > 0.51. 

2. A method as claimed in claim 1, wherein the protein sequences are obtained 
from pathogens, eukaryotes, and multicellular organisms. 

3. A method as claimed in claim 1, wherein the protein sequences are obtained 
from the pathogens selected from a group of organisms comprising Escherichia 
coli, Haemophilus influenzae, Helicobacter pylori, Mycoplasma pneumoniae, 
Mycobacterium tuberculosis, Rickettsia prowazekii, Porphyromonas 
gingivalis, Shigella flexneri, Streptococcus mutans, Streptococcus pneumoniae, 
Neisseria meningitidis, Streptococcus pyogenes, Treponema pallidum and 
Severe Acute Respiratory Syndrome associated human coronavirus (SARS). 

4. A method as claimed in claim 1, wherein the method is a non-homology 
method. 

5. A method as claimed in claim 1, wherein the method uses 105 compositional 
properties of the sequences. 

6. A method as claimed in claim 1, wherein the method shows sensitivity of at 
least 90%. 

7. A method as claimed in claim 1, wherein the method shows specificity of 
100%. 

8. A method as claimed in claim 1, wherein the method helps identify adhesins 
from distantly related organisms. 

9. A method as claimed in claim 1, wherein the neural network has multi-layer 
feed forward topology, consisting of an input layer, one hidden layer, and an 
output layer. 

10. A method as claimed in claim 9, wherein the number of neurons in the input 
layer is equal to the number of input data points for each attribute. 

11. A method as claimed in claim 1, wherein the "Pad" is a weighted linear sum of 
the probabilities from five computed attributes. 

12. A method as claimed in claim 1, wherein each trained network assigns a 
probability value of being an adhesin for the protein sequence. 

13. A computer system for performing the method of claim 1, said system 
comprising a central processing unit, executing SPAAN program, giving 
probabilities based on different attributes using Artificial Neural Network and 
inbuilt other programs of assessing attributes, all stored in a memory device 
accessed by CPU, a display on which the central processing unit displays the 
screens of the above mentioned programs in response to user inputs; and a user 
interface device. 

14. A set of 274 annotated genes encoding adhesin and adhesin-like proteins, 
having SEQ ID Nos. 385 to 658. 

15. A set of 105 hypothetical genes encoding adhesin and adhesin-like proteins, 
having SEQ ID Nos. 659 to 763. 

16. A set of 279 annotated adhesin and adhesin-like proteins of SEQ ID Nos. 1 to 
279. 

17. A set of 105 hypothetical adhesin and adhesin-like proteins of SEQ ID Nos. 280 
to 384. 

18. A fully connected multilayer feed forward Artificial Neural Network based on 
the computational method as claimed in claim 1, comprising an input layer, a 
hidden layer and an output layer which are connected in the said sequence, 
wherein each neuron is a binary digit number and is connected to each neuron 
of the subsequent layer for identifying adhesin or adhesin-like proteins, wherein 
the program steps comprise: 

[a] feeding a protein sequence in FASTA format; 

[b] processing the sequence obtained in step [a] through the 5 modules 
named A, C, D, H and M, wherein attribute A represents an amino acid 
composition, attribute C represents a charge composition, attribute D 
represents a dipeptide composition of the 20 dipeptides [NG, RE, TN, 
NT, GT, TT, DE, ER, RR, RK, RI, AT, TS, IV, SG, GS, TG, GN, VI 
and HR], attribute H represents a hydrophobic composition and attribute 
M represents amino acid frequencies in multiplets, to quantify 5 types of 
compositional attributes of the said protein sequence to obtain numerical 
input vectors respectively for each of the said attributes, wherein the sum 
of numerical input vectors is 105; 

[c] processing of the numerical input vectors obtained in step [b] by the 
input neuron layer to obtain signals, wherein the number of neurons is 
equal to the number of numerical input vectors for each attribute; 

[d] processing of signals obtained from step [c] by the hidden layer to obtain 
synaptic weighted signals, wherein the optimal number of neurons in the 
hidden layer was determined through experimentation for minimizing 
the error at the best epoch for each network individually; 

[e] delivering synaptic weighted signals obtained from step [d] to the output 
layer for assigning of a probability value for each protein sequence fed 
in step [a] as being an adhesin by each network module; and 

[f] using the individual probabilities obtained from step [e] for computing 
the final probability of a protein sequence being an adhesin denoted by 
the Pad value, which is a weighted average of the individual probabilities 
obtained from step [e] and the associated fraction of correlation which is 
a measure of the strength of the prediction. 
19. A network as claimed in claim 18, wherein the input neuron layer consists of a 
total of 105 neurons corresponding to 105 compositional properties. 

20. A network as claimed in claim 18, wherein the hidden layer comprises 
neurons represented as 30 for amino acid frequencies, 28 for multiplet 
frequencies, 28 for dipeptide frequencies, 30 for charge composition and 30 for 
hydrophobic composition. 

21. A network as claimed in claim 18, wherein the output layer comprises 
neurons to deliver the output values as probability value for each protein 
sequence. 
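
The network topology and the Pad combination recited in the claims above can be sketched as follows. This is an illustrative sketch under stated assumptions, not the SPAAN implementation: the sigmoid activation, the function names, the zero-valued demonstration weights and the per-module weights passed to `combine_pad` are all hypothetical. Only the single hidden layer, the hidden-layer sizes, the weighted averaging of five module probabilities and the 0.51 cut-off come from the claim text.

```python
import math

# Hidden-layer sizes per attribute module, as recited in the claims
# (A amino acid, C charge, D dipeptide, H hydrophobic, M multiplet).
HIDDEN_NEURONS = {"A": 30, "C": 30, "D": 28, "H": 30, "M": 28}


def sigmoid(x):
    # Assumed activation function; the claims do not name one.
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """One pass through input -> hidden -> output; returns a value in (0, 1)."""
    hidden = [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(hidden_weights, hidden_biases)
    ]
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)) + output_bias)


def combine_pad(probabilities, weights):
    """Weighted average of the per-module probabilities (the Pad value)."""
    total = sum(weights[m] for m in probabilities)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(weights[m] * p for m, p in probabilities.items()) / total


def is_adhesin(pad, threshold=0.51):
    # Comparison follows the claim text as it reads here ("> 0.51").
    return pad > threshold


if __name__ == "__main__":
    n_in, n_hid = 20, HIDDEN_NEURONS["D"]  # 20 dipeptide inputs, 28 hidden neurons
    untrained = forward(
        [0.05] * n_in,
        [[0.0] * n_in for _ in range(n_hid)],  # placeholder weights, not trained values
        [0.0] * n_hid,
        [0.0] * n_hid,
        0.0,
    )
    made_up = {"A": 0.9, "C": 0.6, "D": 0.8, "H": 0.7, "M": 0.5}  # illustrative only
    pad = combine_pad(made_up, {m: 1.0 for m in made_up})
    print(untrained, pad, is_adhesin(pad))
```

In practice each module would carry its own trained weights, and the weights given to `combine_pad` would be the per-module fractions of correlation that the claims mention; equal weights are used here only to keep the demonstration self-contained.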
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The Neural Network architecture 
Figure 1 
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Figure 3 (a) 
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Figure 3(c) 
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Applicat-ion Project 

<120> Title : 
<130> AppFileReference : 
<140> CurrentAppNumber : 
<141> CurrentFilingDate : 

Sequence 

<213> OrganismName : Escherichia coli O157:H7 
<400> PreSequenceString : 
MINLSKEATV GKALTPIAIL MMLSFPVASQ AAGLVIKNGT VYNAHGVPW DINKPNGSGL 60 
SHNIWDNIiNV DKNGWFNNS ANESSTSLAG NIQGNS^IIiTS GSAKVILNEV TSKNPSTING 120 

MMFV.vjjjK.-r i7iA-:'?y.c^T vNCGGCii'-rv rsj^z-TTt^-^w iQDDKhHcrrs ^'jw-Tir:.",": -.zc; 

15 LiDNASPTEIL SRNVWNGKV SADELNWA6 NNYVNAAGQV TGSVSATGSR NGYSVDVAKL 240 
GGMYAJTKISL VSTEKGVGVR NLGVIAGGVN GVSIDSKGNL LNSNAQIQSA STUTLTTNGT 3 00 

XiDNTTGTVTS VGTISLNTNK NTIVNTRAGN ISTMGDIYVN SGTIDNTNGK LAAAGMLAVD 360 
TNNATIiINSG KGSSVGIEAG LVALKTGTLN NSNGQIRGGY VGLESAAL3SIN NNGDIQTTGD 420 
lAIISKTGNVD NNKGLIRSST GHIVIGAAGS VNNGSTKTAD TGSSDSLGII ADTGVEIGAN 480 

20 KTINNNGGQIA SNGNVSLSSY STIDDYAGKI LSNSKVIIKG SSLRNDTGGI SGKQGIEVAV 540 
GGSLTNNIGV ISSEEGDISL LANSVDNHGG FMMGQNITME SMSGVNNNTA LIVASKKLKI 600 
KTARGSIENRD GNNFGKTAYGL YFGMPQQTGG MVGKEGIELS GQNIYlilNNSR LIAEDGPIiTL 660 
QAQNTFDNTR AIiVTSGADAS IQVGGTYYNN YATTWSAGNL DIDATTLQNS SSGTMIDNNA 720 
TGFIASDKNL SLEWNSLTN YGWIS6KGDV DVTVNNGNLY NRNTIAAEKG LDIAALNGIE 780 

25 NWKDISAGGD LTMNTNRHVT NNSNSNMVGQ NIVINAVNDI NISIRGNIVSDA DLNVTTKGNL 840 
YNYLYMVGYG DIALSANSVA NNNATIEAT6 DLIIDSKGNV GNNRGNLHAIi NGVLSVKGNN 900 
LNNDKTGEIRG YGDVTLALTG NYDSYKGSLT SETGDVTLTA NIVDNAYGLI AGENVSVDAK 960 

STIYNNTALI AANKKLVINA GGNLENRDGN NFLR3SINGALF GITDNVGGXV GKEGVTLSAQ 1020 

13VYNNNSSII AEKTGPLNIiLS RGTIiDNTRAL LSSGADAIIR AA6TFYNWYA TTYSAGNLDV 108 0 

30 YAASLWNASD GRLEDNTATG VIASDKNLDL SVDNSVTNYG WISGKGDVHF ITVLKGTIiYNR 1140 

MAXAADNALT INALNGVEHF KDIVAGTALT IDTQKYVTNN SNSNMLGQTI AINAVNDINN 12 00 

RGNIVGDYSL GVKTTGNIYN YLlSIMIiSYGVA GVSANKVTNS GKDAVLGGFY GLALEANETD 1260 

NTGTIVGM 1268 
<212> Type : PRT 
<211> Length : 1268 
SequenceName : SEQ ID 1 
SequenceDescription : 



Sequence 

<213> OrganismName : Escherichia coli O157:H7 
<400> PreSequenceString : 

MNTIHLRCLF RMNPLVWCLW ADVAAKLRSL KRYSVFTFQR MKFMNRTSPY YCRRSVLSLL 60 

ISALIYAPPG MAAFTPDVIG WNDETVDGS QRVDERGTTN NTHXINHGQQ NVYGGVSNGS 120 

45 LIESGGYQDV GRHNNYVGQS NNTTXNGGRQ SIHDGGXSTG TIIESGNQDV YKGGISNGTT 180 

IKGGASRVEG GSAUGTLIDG GSQIVKVQGH ADGTTINKSG SQDWQGSLA TKfTTINGGRQ 240 

YVEQSTVETT TIKNGGEQRV YESRALDTTX EGGTQSLNSK STAKNTQIYS GGTQXXDNTS 300 

SSDVIEVYSG 6VLDVSGGTA TNVTQHDGAX LKTNTNGTTV SGTNSEGAPS IHNHVADNVL 360 

LENGGHLDIN AYGSANKTII KDKGTMSVLT 3S3AKADATRID NGGVMDVAGN" ATNTIXNGGT 420 

50 QNIKTKfYGXAT GTNXNSGTQN XKSGGKADTT XXSSGSRQW EKDGTAIGSN ISAGGSIiXVY 480 

TGGXAHGVNQ ETGSALVANT GAGTDIEGYN KLSHFTITGG EANYWLENT GELTWAKTS 540 

AKNTTIDAGG KLIVQKEAKT DSTRLNNGGV LEVQDGGBAK HVEQQSGGAL lASTTSGTLI 600 

EGTNSYGDAF YIRKTSEAKNV VLENAGSLTV VTGSRAVDTI INANGKMDVY GKDVGTVLNS 660 

AGTQTIYASA TSDKANIKGG KQTVYGLATE ANIESGEQIV DGGSTEKTHX NGGTQTVQNY 720 

55 GKAINTDIVS GLQQIMANGT AEGSIINGGS QIVNEGGIiAE NSVLNDGGTL DVREKGSATG 780 

XQQSSQGALV ATTRATRVTG TRADGVAFSI BQGAANNILL ANGGVLTVES DTSSDKTQVN 840 

TGGREIVKTK ATATGTTLTG GEQIVEGVAN ETTINDGGIQ TVSANGEAXK TTINEGGTLT 900 

VNDNGKATDX VQNSGAALQT STANGXEXSG THQYGTFSXS GNLATNMLLE NGGNLLVLAG 960 

TEARDSTVGK GGAMQNQGQD SATKVNSGGQ YTLGRSKDEP QALARAEDLQ VAGGTAIVYA 1020 

60 GTLADASVSG ATGSLSLMTP RDNVTPVKLE GAXRITDSAT LTIGNGVDTT LADLTAASRG 1080 

SWLJSrSNNSC AGTSNCEYRV NSLLLNDGNV YLSAQTAAPA TTNGXYNTLT TNELSGSGNF 1140 

YIiHTNVAGSR GDQLWNNNA TGNFKIFVQD TGVSPQSDDA MTLVKTGGGD ASFSLGNTGG 1200 

PVDLGTYEYV LKSDGNSNWN LTNDVKPNPD PNPNPNPNPK PDPKPDPKPD PKPDPTPEPT 1260 

PTPVPEKRIT PSTAAVLNMA ATLPLVFDAE LNSXRERLNI MKASPHKNWV WGATYNTR3SIN 1320 

65 VTTDAGAGPE QTLTGMTVGX DSPNDIPEGI ATLGAFMGYS HSHIGFDRGG HGSVGSYSLG 1380 

GYASWEHESG FYLDGWKLKT RFESNVAGKM SSGGAANGSY HSNGLGGHIE TGMRPTDGNW 1440 

NLTPYASLTG PTADNPEYHL SNGMESKSVD TRSIYREIiGA TLSYNMRLGN GMEIEPWLKA 1500 
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AVRKEFVDDN RVKVNKDGNF VNDLSGRRGI YQAGIKASFS STLSGHLGVG YSH6A.GVESP 1560 
WNAVAGVNWS F 1571 
<212> Type : PRT 
<211> Length : 1571 
SequenceName : SEQ ID 2 
SequenceDescription : 

Sequence 

<213> OrganismName : Escherichia coli O157:H7 
<400> PreSequenceString : 

MAVKISGVLK DGTGKPVENC TIQLKARRNS ATVWNTVAS ENPDEAGRYS MDVEYGQYSV 60 

ILLVEGFPPS HAGTITVYED SQPGTLNDFL GAMTEDDVRP EALRRFELMV EEVARNASAV 12 0 

7-i,QjSlTAA?^ r k:S i^-C.^.\finT'jM.^ AATPIi^.TDAAL^ SA'-^"^-TSAG QAASSAQSAS SSAGTASTICA J 30 

15 TEASKSAAAA ESSKSAAATS AGAAKTSETN AAVSQQSAAT SASTATTKAS EAASSARDAS . 24 0 

ASKEAAKSSE TSAASSASSA ASSATAAGNS AKAAKTSETN AKSSETAAEQ SASAAAGSKT 3 00 

AAALSASAAS TSAGQASASA TAAGKSAESA ASSASTATTK AGEATEQASA AASSASAAKT 3 60 

SETMAKASET SAESSKTAAA SSASSAASSA SSASASKDEA TRQASAAKSS ATTASTKATE 42 0 

AAGSATAAAQ SKSTAESAAT RAETAAKRAE DIASAVALED ASTTKKGIVQ LSSATNSTSE 4 80 

20 SLAATPKAVK AAYELANGKY TAQDATTAQK GIVQLSMATN STSEMLAATP KSVKAAYDLA 54 0 

NGKYTAQDAT TAQKGIVQLS SATNSASETL AATPKAVKAA NDNANGRVPS ARKVNGKALS 600 

SDITLTPKDI GTLNSTTMSF SGGAGWFKLA TVTMPQASSV VSITLIGGAG FNVGSPQQAG 660 

ISELVLRAGN GNPKGITGAL WQRTSTGFTN FAWVNTSGDT YDIYVAIGNY ATGVNIQWDY 72 0 

TSNASVTIHT SPAYSANKPE GLTDGTVYSL YTPSEQFYPP GAPIPWPSDT VPSGYALMQG 78 0 

25 QTFDKSAYPK LAAAYPSGVI PDMRGWTIKG KPASGRAVLS QEQDGIKSHT HSASASSTDL 840 

GTKTTSSFDY GTKSTNNTGA HTHSVSGTAA SAGNHTHSVT GASAVSQWSQ NGSVHKWSA 900 

ASVNTSAAGA HTHSVSGTAA SAGAHAHTV6 IGAHTHSVAI GSHGHTITVN AAGNAENTVK 960 

NTIAFNYIVRL A 971 
<212> Type : PRT 
<211> Length : 971 
SequenceName : SEQ ID 3 
SequenceDescription : 

Sequence 

<213> OrganismName : Escherichia coli O157:H7 
<400> PreSequenceString : 

MKRVITLFAV LLMGWSVNAW SFACKTANGT AIPIGGGSAN VYVNLAPAVN VGQNLWDLS 60 
TQIFCHNDYP ETITDYVTLQ RGSAYGGVLS NFSGTVKYSG SSYPFPTTSE TPRWYNSRT 120 
40 DKPWPVALYL TPVSSAGGVA IKAGSLIAVL ILRQTKNYNS DDFQFVWNIY ANNDVWPTG 180 
GCDVSARDVT VTLPDYPGSV PIPLTVYCAK SQNLGYYLSG TTADAGNSIF TNTASFSPAQ 240 
GVGVQLTRNG TIIPANNTVS LGAVGTSAVS LGLTANYART GGQVTAGNVQ SIIGVTFVYQ 300 
<212> Type : PRT 
<211> Length : 300 
SequenceName : SEQ ID 4 
SequenceDescription : 

Sequence 

<213> OrganismName : Escherichia coli O157:H7 
<400> PreSequenceString : 

MGYVTGGLPM KNNRAWALIS GLILFSGTAP AADNLHFTGN LLGKSCTPVI NGNLLAEIHF 60 

PTIAASDLMQ RGQSDRVPLV FQLKDCKSTT AFNVKVTLMG TEDTDLPGFL SIDSSSSATG 120 
55 VGIGIETAGG AAVPINSTTG ASFPLNQGNN SVNPNAWLQT VNGRNVTSGD FTATMTVTFE 180 

YF 182 

<212> Type : PRT 

<211> Length : 182 

SequenceName : SEQ ID 5 
SequenceDescription : 

Sequence 

<213> OrganismName : Escherichia coli O157:H7 
<400> PreSequenceString : 
MKRHLNTSYR LVWNHITGTL WASELARSR GKRAGVAVAL SLAAVTSVPA LAADKWQAG 60 
ETVNDGTLTN HDNQIVFGTA NGMTIST6LE LGPDSEENTG GQWIQNGGIA GNTTVTTNGR 120 
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QWLEGGTAS DTVIRDGGGQ 
GGLATGTIIN TGAEGGPDSD 
DQSVHGRALN TTLNGGYQYV 
GGEATAVTQN TGGALVTSTA 
5 TLVDDGGTLA VSAGGKATSV 
NGGSFTVNAG GQAGNTTVGH 
EIRFDNQTTP NAALSRAVAK 
GGQATGKTWL AFTNVGNSNL 
NRDSDEDWYL RSENAYRAEV 

10 GHLGHDNNGG lARGATPESS 
DGSRAGTVRD DAGSLGGYLN 
LETGLPFSIT DNLMLEPQLQ 
TFGEGTSSRD TLRDSAKHSV 
MGTSLDLQAG XjETJiIRENXT 
<212> Type : PRT 
<211> Length : 949 
SequenceName : SEQ ID 6 
SequenceDescription : 

Sequence 

<213> OrganismName : Escherichia coli O157:H7 
<400> PreSequenceString : 

MKKWHYIFCI ILFHLGLPCG YAANDGTCAT RGGTHTLSLN FPLTTVSAAN IJJVPGNTLIDI 60 
25 AlsTATSSEMYS VLCNCDSKHS NGAYHEIYYT ADPAPGMVYS TTASGIiAFYY LNEYVDVGTK 12 0 

ISVIjNAGYTA VPFEHVSNQA TTTDHTCQGN KTTAVGVSLK TGADAKISFR IKRSINGTW 180 
IPITDIALLY ANISSTTTRG EAIAKVRISG SLTAPQSCQI NAGQVIYFDF DTIPASEFSS 240 
TAGQAITSRK ITKTVSIECT GMGYERTQKV DASFTGTNRS SDDTMVATDN ADVGIKIYNK 3 00 

SNAEVSVNNG KLPADMGNTT IFGRKNGSVT FSAAPASFTG ARPQPGVFNA TATLTIEFVN 3 60 

30 

<212> TypG : PRT 
<211> Length : 360 

SequenceName : SEQ ID 7 

SequenceDescription : 


Sequence 



<213> OrganismName : Escherichia coli 0157:H7
<400> PreSequenceString : 

MSRYKTGHKQ PRFRYSVLAR CVAWANISVQ VLFPLAVTFT PVMAARAQHA VQPRLSMGNT 60

TVTADNNVEK NVASFAANAG TFLSSQPDSD ATRNFITGMA TAKANQEIQE WLGKYGTARV 120

KLNVDKDFSL KDSSLEMLYP lYDTPTNMLF TQGAIHRTDD RTQSNIGFGW RHFSGNDWMA 180 

GVNTFIDHDL SRSHTRIGVG AEYWRDYLKL SANGYIRASG WKKSPDIEDY QERPAFGWDI 240 

RAEGYLPAWP QLGASLMYEQ YYGDEVGLFG KDKRQKDPHA ISAEVTYTPV PLLTLSAGHK 300

QGKSGENDTR FGLEVNYRIG EPLAKQLDTD SIRERRVLAG SRYDLVERISTW NIVLEYRKSE 360

VIRIALPERI EGKGGQTLSL GLWSKATHG LKNVQWEAPS LLAEGGKITG QGSQWQVTLP 420 

AYRPGKDNYY AISAVAYDNK GNTSKRVQTE WITGAGMSA DRTALTLDGQ SRIQMLANGN 480 

EQKPLAHIjSLR DAEGQPVTGM KDQIKTELTF KPAGNIVTRS LKATKSQAKP TLGEFTETEA 540 

GVYQSVFTTG TQSGEATITV SVDGMSKTVT AEIiRATMMDV ANSTLSANEP SGDWADGQQ 600 

AYTLTLTAVD SEGNPVTGEA SRLRFVPQDT NGVTVGAISE IKPGVYSAAV SSTRAGNWV 660

RAFSEQYQLG TLQQTLKPVA GPLDAAHSSI TLNPDKPWG GTVTAIWTVK DAYDNPVTSL 720

TPEAPSLAGA AAEGSTASGW TNNGDGTWTA QITLGSTAGE LEVMPKLNGQ NAAANAAKVT 780

WADALSSNQ SKVSVAEDHV KAGESTTVTL VAKDAHGNAI SGLALSASLT GTASEGATVS 840 

SWTEKGNGSY VATLTTGGKT GELRVMPLFN GQPAATEAAQ LTVIAGEMSS ANSTLVADNK 900 

APTVKTTTEL TFTVKDAYGN PVTGLKPDAP VFSGAASTGS ERPSAGNWTE KGNGVYVSTL 960

TLGSAAGQLS VMPRVNGQNA VAQPLVLNVA GDASKAEIRD MTVKVNNQLA NGQSANQITL 1020 

TWDTYGHPL QGQEVTLTLP QGVTSKTGNT VTTNAAGKAD IELMSTVAGE HNISASVNGA 1080 

QKTVTVKFNA DASTGQANLQ VDAAAQKVAN GKDAFTLTAN VEDKNGNPVP GSLVTFNLPR 1140

GVKPLTGDNV WVKANDEGKA ELQWSVTAG TYEITASAGN SQPSNTQTIT FVADKATATV 1200

SGIEVIGNYA LADGNAKQTY KVTVTDANNN LLKDSEVTLT ASPANLVLTP NGTAKTNEQG 1260

QAIFTATTTV AAKYTLTAKV SQADGQESTK TAESKFVADD TNAVLTASSD VTSLVADGIS 1320 

TAKLEVTLMS ANNPVGGNMW VDIKTPEGVT EKDYQFLPSK NDHFVSGKIT RTFSTSKPGV 1380

YTFTFNALTY GGYEMKPVTV TITAVDADTA KGEEAMN 1417 
<212> Type : PRT 

<211> Length : 1417

SequenceName : SEQ ID 8 
SequenceDescription : 



SLNGLAVNTT LNNRGEQWVH EGGVATGTII NRDGYQSVKS 180 

NSYTGQKVQG TAESTTINKKT GRQIILFSGL ARDTLIYAGG 240 

HRDGLALNTV INEGGWQWK AGGAAGNTTI ISTQNGELRVHA 300

ATVIGTMRLG NFTVENGKAD GWLESGGRL DVLESHSAQKT 360

TITSGGALIA DSGATVEGTN ASGKFSIDGT SGQASGLLLE 420 

RGTLTLAAGG SLSGRTQLSK GASMVLNGDV VSTGDIVNAG 480 

SNSPVTFHKL TTTNLTGQGG TINMRVRLD6 SNASDQLVIlsr 540 

GVATTGQGIR WDAQNGATT EE6AFALSRP LQAGAFNYTL 600 

PLYTSMLTQA MDYDRILAGS RSHQTGVNGE NNSVRLSIQ6 660 

GSYGFVRLEG DLLRTEVAGM SLTTGVYGAA GHSSVDVKDD 720 

LVHTSSGLWA DIVAQGTRHS MKASSDNNDF RARGWGWLGS 780 

YTWQGLSLDD GQDNAGYVKF GHGSAQHVRA GFRLGSHNDM 840 

SELPWWWVQ PSVIRTFSSR GDMSMGTAAA GSNMTFSPSR 900 

LGVQAGYAHS VOGGSAEGYN GQATLNMTF 949
Sequence



<213> OrganismName : Escherichia coli 0157:H7
<400> PreSequenceString :

MARGWASSEA SGAMTDWLNN PGTARISLGV DEDFSIiKNSQ FDFLHPWYDT PDYLLFSQHT 60 
LHRTDDRTQI NT6LGWRHFT SSWMSGINLF FDHDLSRYHS RAGLGAEYWR DYLKIiSSNAY 120

IGLTGWRSAP ELDNDFEARP ANGWDLRAEG WLPAWPQLGG KLVYEQYYGD EVALFDKNDR 180 

QSNPHAITAG LNYTPFPLLT LSAEQRQGKQ 6E1SFDTRFAVD LTWQPSSSMQ KQLNPDEVAG 240 

RRSIiAGSRYD LIDR3SINNIVL EYRKKEIiIRL SLLDPVKGKS GEIKPLVSSL QTKYALKGYIT 300

lEAAALEAAG GKVSTSGKDI TVTLPGYRPT NTPETDNTWS IDVTAEDVKG NLSRHEQSMV 360

VIQAPTLSQK DSIiLSVNPLT VAADKKSTTT LTVTAHDSDG TPVPGLAIiQT RSEGVQDITL 420 

SDWTDNGDGS YTQILTAGTT SGSVTIiTPQI NGESAVKESI WNIVPWSS RDHSSITIDN 480 

VSYYAGDDIK V?<VELKDDSN QPVAYQKEEL VKAVTVEITSK PGATIVWHEE QPGVYAANYF 540

AYKQGTALRA QLSLHNWNAP LQSHIYNIBA NQNKARVATL SATNNDVYAD KKTFNTLTIN 600

VTDESDHPIiT NHQVTFKNEK GSAEFVEPPQ QNTDAYGVAT INMVSQVAEE NTISATLPNG 660 

FSQRIIAKFV SDSSTPKFKQ LVADPDTIIA GNSQGSTLTA IITDPHNNPL KDMKVNFVAP 720 

GGSQIiDNTTA TTDQSGIVRV HLTSSKAGSY SVDASLEVDK NIHQSVTITV VPNREQSVMT 780 

LNAGSGSAIA NNTNIVTLTA SVKDVYGHPL PDEDVKFTLP ASMTGNFTLS SETARTDANG 840 

DAWTLRGTK AGEFTVTATL TRJSnSTTVAYQQ VTFIGDTNSA QLQPLTASLN SIVAGNSTGS 900

TLTATILDAY QNPLKDQLVT FQSNDVTLSE TEVTTNTLGQ ATVTMTSNXA GQHNVWSRK 960 

AQASDNKTFS LSVLPDESSA KVISITGAEK TITVGENITL RILVQDAFNN VIAGQRVRLS 1020 

AQPTTNITIG DTAYTDNNGY AYVNLIiSTQP GVYQVTATLD NNSSSKVDVN VANGKLELTS 1080 

SKPETTVHNS EGITLTATAR NARGELMPGQ IITFSVTPEG ATLSNTGEVL TDQSGQAKVT 1140 

LTSDKVNVYT VTAIMGKDVP VQSQVTVAVK ADAKTAHWS WASPDTITA DGIDSSTITS 1200

RVEDDYGFPV EGVDISHGLD TKGSPWNIP TTRTDQSGQV TATITSTXiAE TLTVNVQVPG 1260 

TANQSATITL VAGTADESKS IIiKSDVDTLK ADYQQSAKLT LTLQDKYGNP IVTSDHLEFV 1320 

QSGPFVNFLK LSDIDYSQRN YGEYTVTVT6 GKEGTATLIP MLNGVHQANL SISLNLIQSI 1380 

KEMSGHVTAN NHTFSTAKFP SEGFAGAYYT LNNDNFEAGK TVDDYMFSSS QGWVSVDAS6 1440

KVSFANIGDQ TSVTISAVPR QGGTTYQTLI KLKGWWVKITG NHTNIWLiAAH ALCHAKNDGY 1500

NLPGXTHIiTS GENKRTQGSL YGEWGNVGAF SSNSQFTPGA YWTSESDDYS RHYYVQMLTG 1560 

MTGSDADSSP QLTACRKSL 1579 
<212> Type : PRT 
<211> Length : 1579 

SequenceName : SEQ ID 9

SequenceDescription :

Sequence 



<213> OrganismName : Escherichia coli 0157:H7
<400> PreSequenceString : 

MITHGCYTRT RHKHKLKKTIi IMLSAGLGLF FYVNQNSFAN GENYFKLGSD SKLLTHDSYQ 60 

NRLFYTLKTG ETVADLSKSQ DINLSTIWSL NKHIiYSSESE MMKAAPGQQI ILPIiKKLPFE 120 

YSALPLLGSA PLVAAGGVAG HTNKLTKMSP DVTKSNMTDD KALNYAAQQA ASLGSQLQSR 180 

SLNGDYAKDT ALGIAGNQAS SQLQAWLQHY GTAEVNLQSG KINPDGSSLDF LLPFYDSEKM 240

IiAFGQVGARY IDSRFTANLG AGQRFFLPAN MLGYNVFIDQ DFSGDNTRLG IGGEYWRDYF 300 

KSSVNGYPRM SGWHESYNKK DYDERPANGP DIRPNGYLPS YPALGAKLIY EQYYGDNVAL 360 

PNSDKLQSNP GAATVGVNYT PIPLVTMGID YRHGTGNEND LLYSMQFRYQ FDKSWSQQIE 420 

PQYVNELRTL SGSRYDLVQR NNNIILEYKK QDXLSIiNIPH DINGTEHSTQ KIQLIVKSKY 480 

GLDRIVWDDS ALRSQGGQIQ HSGSQSAQDY QAILPAYVQG GSNIYKVTAR AYDRNGNSSN 540

NVQLTITVLS NGQWDQVGV TDFTADKTSA KADNADTITY TATVKKNGVA QANVPVSFNI 600 

VSGTATLGAN SAKTDANGKA TVTLKSSTPG QVWSAKTAE MTSALNASAV IFFDQTKASI 660 

TEIKADKTTA VANGKDAIKY TVKVMKNGQP VNNQSVTFST NFGMFNGKSQ TQATTGNDGR 720 

ATITIiTSSSA GKATVSATVS DGAEVKATEV TPFDELKIDN KVDIIGNNVR GELPNIWLQY 780 

GQFKLKASGG DGTYSWYSEN TSIATVDASG KVTLNGKGSV VIKATSGDKQ TVSYTIKAPS 840

YMIKVDKQAY YADAMSICKN LLPSTQTVLS DIYDSWGAAN KYSHYSSMNS ITAWIKQTSS 900 

EQRSGVSSTY NIiITQNPLPG VNVNTPNVYA VCVE 934 
<212> Type : PRT 
<211> Length : 934 

SequenceName : SEQ ID 10

SequenceDescription :

Sequence



<213> OrganismName : Escherichia coli 0157:H7
<400> PreSequenceString : 

MLVLSESFKN KLLPMKTGYMK GGSDSGSKAQ ARATEKGIBL QREMWQTNMQ NLAPFTPLAQ 60 



QYVSQLQNLS SLQGQGQALN QYYNSQQYKD LAGQARYQSL AAAEATGGLG STATGNQLAA 120 
XAPTLGQNWL SGQMNNYNNL ANIGLGALTG QANAGQNYAN NVSQLYQQQA AASAANANKP 180 
SGLQSFATGA IGGAASGAMI GSAVPVIGTG IGAIiAGGVIG GLGSLP 226 
<212> Type : PRT 
<211> Length : 226

SequenceName : SEQ ID 11

SequenceDescription : 

Sequence 



<213> OrganismName : Escherichia coli 0157:H7
<400> PreSequenceString :

MKKILSGLIL LLCCPYGPAA NGD6ATHMSN LSFGPLTVAA ANNHSGYNIP EALSNTTGTY 60 
E>VRGHCDDTH GGPGQQTAFF PIPYTGDAAP GLVLERTLivTG Lir£TALNDYL SVGVTIPilN 120

KTQYAAIPFEH LSKTQSTSPQH TCGAGNNGST VNLDSGRSAK IiSFYVRHSIT 6TVTIPTTBV 180

AWIiYAGMSDH FPKTTPVSKV TIRGQIiTAPQ NCELTPNQSI DVDFQKINSA EFSSTAGSII 240 

AERKIKTEVT VSCTGMEDVR STEWSASMI AANRSADATM IVTSNPDVGI KIFDKNDRPV 300

KrVDGGNIiPAD MGAISRLGKT DGSVTFYSAP ASIiTGAKPAP DNGPTATATL VIEPTN 356

<212> Type : PRT

<211> Length : 356 

SequenceName : SEQ ID 12
SequenceDescription : 

Sequence



<213> OrganismName : Escherichia coli 0157:H7
<400> PreSequenceString :

MNKIYRLKWN RSRNCWSVCS BLGSRVKGKK SRAVLISAIS LYSSLVFADD VIVKQDKTID 60 

FGKENQSIDY RITVTDNANL VINATDTSRP RLTLASGGGL DITGGKVTIN 6PLNFLLKGT 120

GFIiNVSNAGS ELYADDLYBS NSGMRHDR6Y FNVSNGGKIH VKGTSRLTYL QGNVSGEGSQ 180 

VNSETFFMGV YGSYGGNQYL SViaNGGEVNA RKQISLGYYD QVSDTTLAVS EGGKISAPTI 240 

SLSTNSEIiAL GAQE6SAAKA AGIIDAEKIB FVWAKTSEKK ITLNHTDKDA TISADIVSGS 300

EGLGYINALlSr GTTYLTGDNS AFSGKVKIEQ NGALGITQKTI GTAEINNRGK LHLKADDSMT 360

KANKISGNGT ISXDSGTVEL TGNNYAFSGY IDVASGAVAV ISEDKNTIGRA ELDVDGKLQI 420

MANKDWVFDN DLEGRGIVEI NMGNHEFSFD EPAYTDWFQG SLAFQNTTFN LEKMAEFLQK 480 

GGITAGQGSL VTVGK6AHSI STLGFSGGTV DFGALTAGAQ MTEGTVNVSK TLDLRGEGVI 540 

QVSDSDWRS VSRDIDSALS LTEVDDGNST IKLVDAQGAE VLGDAGNLQL QDKNGQILSS 600 

SAQRDIQQMG QKAAVGTYDY RLTSGVWNDG LYIGYGLTQL DLHATDSDAL VLSSNGKSEKT 660 

AADLSAKITG SGDLAFSSQK GQTVSLSNKD NDYTGVTDLR SGTLLLNNDN VLGNTHELRL 720

AAETELDMNG HSQTVGTLNG SADSLLSLNG GSLTVTMGGT STGSLTGSGE LNIQGGTLDI 780 

AGDNSNIiTAW VNIANSANVL VSHAQGLGSA NVENNGTLAL KNSAEKRAAA SVNYALGGNL 840 

TlSrMGTLMTGM SGQQAGNVLV VKGNYHGNNG QLVMNTVLNG DDSVTDKLW EGDTSGTTAV 900 

TVNNAGGTGA KTLNGIELIH VDGKSEGEFV QAGRIVAGAY DYTLARGQGA NSGNWYLTSG 960 

SDSPELQPEP DPMPNPEPNP NPEPNPNPTP TPGPDLNVDN DLRPEAGSYI ANLAAANTMF 1020

TTRLHERLGN TYYTDMVTGE QKQTTMWMRH EGGHNKWRDG SGQLKTQSNR YVLQLGGDVA 1080 

QWSQNGSDRW HVGVMAGYGN SDSKTISSRT GYRAKASVNG YSTGLYATWY ADDESRNGAY 1140 

LDSWAQYSWF DNTVKGDDLQ SESYKSKGFT ASLEAGYKHK LAEFNGSQGT RNEWYVQPQA 1200

QVTWMGVKAD KHRESNGTLV HSNGDGNVQT RLGVKTWLKS HHKMDDGKSR EFQPFVEVNW 1260 

LHNSKDPSTS MDGVSVTQDG ARNIAEIKTG VEGQLNANLN VWGWGVQVA DRGYNDTSAM 1320

VGIKWQF 1327 
<212> Type : PRT
<211> Length : 1327

SequenceName : SEQ ID 13 

SequenceDescription :

Sequence 



<213> OrganismName : Escherichia coli 0157:H7
<400> PreSequenceString :

MITMKKSVLT AFITWCATS SVMAADDNAI TDGSVTFNGK VIAPACTLVA ATKDSWTLP 60 
DVSATKLQTN GQVSGVQTDV PIELKDCDTT VTKNATFTFN GTADTTQITA FANQASSDAA 120 
TNVALQMYMN DGTTAIKPDT ETGNILLQDG DQTLTFKVDY lATGKATSGN VNAVTNFHIN 180 

182 

<212> Type : PRT

<211> Length : 182 

SequenceName : SEQ ID 14 
SequenceDescription :
Sequence



<213> OrganismName : Escherichia coli 0157:H7
<400> PreSequenceString : 

MSKFVKTAIA ATMVMGAFAS TSTIAAGNNG TARFYGTIED SPCSIVPDDH KLEVDMGDIG 60 

SGILKNNGTS TPKAPQIHLQ DCVFDTQTTM TTTPTGNASS TNSGNYYTIY NTDTGAAFNN 120

VSIiAIGDAQG TSYKSGAGIE QKIVNDTATN KGKAKQTLDF KAWLVGAADA PDLGNFEANT 180

TFQITYL 187



<212> Type : PRT 

<211> Length : 187 

SequenceName : SEQ ID 15 
SequenceDescription :


Sequence 



<213> OrganismName : Escherichia coli 0157:H7 
<400> PreSequenceString : 

MRVIFLRKEY LSLLPSMIAS LFSANGVAAA IDLCQGYDIK ASCHASRQSL SGITQVWSIA 60
DGQWLVFSDM TNNASGGAVF LQQGAEFTLS PENETGMTLF ANNTVSGEYN NGGAIFAKEN 120

STLNLTDVIF SGNVAGGYGG AIYSSGTNDT GAIDLRVTNA VFRNNIANDG KGGAIYTINN 180 
DIYLSDDVFM NNQAYTSTSY SDGDGGAIDV TDNNSDSKHP SGYTIINNTA FTNNTAEGYG 240 
GAIYTNSATA PYLIDISVDD SYSQNGGVLV DENNSAAGYG DGPSSAAGGF MYLGLSEVTF 300

DIADGKTLVI GNTENDGAVD SIAGTGLITK TGSGDLVLNA DNIJDFTGEMQ lENGEVTLGR 360

SNSLMNVGDT HCQDDPQDCY GLTIGSIDKY QNQAELNVGS TQQTFAHSLT GFQNGTLNID 420 
AGGNVTVNQG SFAGTIEGAG QLTIAQNGSY VLAGAQSMAL TGDIWDAGA VLSLEGDAAD 480 
IiAALQDDPQS IVLNGGMLDL SDFSTWQSGT SYKDGLEVSG SSGTVIGSQD WDLAGGlSTDM 540 
HIGGDGKDGV YWIDAGDGQ VSLAlSfDNQYIj GTTQIASGTL MVSDNSQLGY THYNRQVIFT 600

DKPQESVMEI TANVDTRSTT TEHGRDIEMR ADGEVAVDAG VDTQWGALMA DSSGQHQDEG 660
STLTKTGAGT LELTASGTTQ SAVRVEEGTL QGDVADIFPY ASSLWVGDGA TFVTGADQDI 720 
QSIDATSSGT IDISDGTVLR LTGQDTSVAL NASLFNCDGT LVNATDGVTL TGELNTMLET 780 
DSLTYLSNVT VNGNLTNTSG AVSLQNGVAG DTLTVNGDYT GGGTLLLDSE LNGDDSVSDQ 840 
IiVMNGNTAGN TTVWNSITG IGEPTSTGIK WDFAADPTQ FQNNAQFSLA GSGYVNMGAY 900 

DYTLVEDNND WYLRSQEVTP PSPPDPDPTP DPDPTQDPDP TPDPEPTPAY QPVLNAKVGG 960

YLNNLRAANQ AFMMERRDHA GGDGQTLNLR VIGGDYHYTA AGQLAQHEDT STVQLSGDLF 1020 

SGRWGTDGEW MLGIVGGYSD NQGDSRSSMT GTRADNQNHG YAVGLTSSWF QHGKQKQGAW 1080

LDNWLQYAWF SNDVSEHEDG VDHYHSSGII ASLEAGYQWL PGRGWIEPQ AQVIYQGVQQ 1140 

DDFTAANRAR VSQSQGDDIQ TRLGLHSEWR TAVHVIPTLD LNYYHDPHST EIEEDASTIS 1200

DDAVKQRGEI KVGVTGNISQ RVSLRGSVAW QKGSDDFAQT AGFLSMTVKW 1250



<212> Type : PRT 

<211> Length : 1250

SequenceName : SEQ ID 16 
SequenceDescription : 


Sequence 



<213> OrganismName : Escherichia coli 0157:H7
<400> PreSequenceString :

MHSWKKKLIV SQLALACTLA ITSQANAATN DISGQTYNTF HHYNDATYAD DVYYDGYVGW 60

NNYAADSYYN GDIYPVINNA TVNGVISTYY LDDGISTNTN ANSLTIKNST IHGMITSECM 120 

TTDCADDRAT GYVYDRLTLS VDNSTIDDNY EHYTYNGTYN NAADTHWDV YDMGTAITLD 180 

QEVDLSITNN SHVAGITLTQ GYEWEDIDDN TVSTGVNSSE VFNNTITVKD STVTSGSWTD 240 

EGTTGWFGHT GNASNYSNTL TADDVAIAAI ANPYADNAMQ TTVTLDNSTL MGDWFSSNF 300

DENFFPQGAN SYRDADGDVD TNGWDGTDRM DVTLNNGSKW VGAAMSVHM\7 DEDGDGSYDG 360

YAVGTEATAT LLDIAANSLW PSSTVGVDNI NTQYDENGHI VGNEVYQSGL FNVTLNGGSE 420 

WDTTKSSLID TLSINSGSQV NVADSRLISD TVSLTGGSNL NIGEDGHVAT NTLTIDNSTV 480 

KMSDDVSAGW GLEDAALYAN TITVTNDGLL DINVDQFDAN PFQADTLNLT STTDTNGNIH 540 

AGVFDIHSSD YVMDTDLVND RTNDTTKSNY GYGLIAMNSD GHLTINGNGD NDNTASIEAG 600 

QNEVDNNGDH VAAATGNYKV RIDNATGAGS lADYNGNELI YVNDKNSNAT FSAANKADLG 660

AYTYQAEQRG NTWLQQMEL TDYANMALSI PSANTNIWNL EQDTVGTRLT NSRHGLADNG 720 

GAWVSYFGGN FNGDNGTINY DQDVNGIMVG VDTKIDGNNA KWIVGAAAGF AKGDMNDRSG 780

QVDQDSQTAY lYSSAHFANN VFVDGSLSYS HFNNDLSATM SNGTYVDGST NSDAWGFGLK 840 

AGYDFKLGDA GYVTPYGSIS GLFQSGDDYQ LSNDMKVDGQ SYDSMRYELG VDAGYTFTYS 900

EDQALTPYFK LAYVYDDSNN DNDVNGDSID NGTEGSAVRV GLGTQFSFTK NFSAYTDANY 960

LGGGDVDQDW SANVGVKYTW 980 
<212> Type : PRT 
<211> Length : 980

SequenceName : SEQ ID 17 
SequenceDescription :

Sequence 

<213> OrganismName : Escherichia coli 0157:H7
<400> PreSequenceString : 

MKLKHVGMIV VSVIxAMSSAA VSAAEGDESV TTTVNGGVIH FKGEWNAAC AIDSESMNQT 60 
VELGQVRSSR LAKAGDLSSA VGPNIKLNDC DTNVSSNAAV AFLGTTVTSN DDXnALQSSA 120 
AGSAQIOVGIQ ILDRTGEVLI LDGATFSAKT DLIDGTNILP FQARYIALGQ SVAGTANADA 180 
TFKVQYL 187 
<212> Type : PRT 
<211> Length : 187

SequenceName : SEQ ID 18 

SequenceDescription :

Sequence 



<213> OrganismName : Escherichia coli 0157:H7
<400> PreSequenceString :

MBCLLKVAAIA AIVFSGSALA GWPQYGGGG GNHGGGGNNS GPNSELNIYQ YGGGNSALAL 60 
QADARNSDLT ITQHGGGNGA DVGQGSDDSS IDLTQRGPGN SATLDQWNGK DSHMTVKQFG 120 
GGNGAAVDQT ASNSTV13VTQ VGFGNNATAH QY 152 
<212> Type : PRT 
<211> Length : 152 

SequenceName : SEQ ID 19 

SequenceDescription : 

Sequence 

<213> OrganismName : Escherichia coli 0157:H7
<400> PreSequenceString : 

MPIGNIiGHMP NVNNSIPPAP PLPSQTDGAG GRGQLINSTG PLGSRALFTP VRNSMADSGD 60 
NRASDVPGLP VNPMRIiAASE ITLNDGFEVL HDHGPLDTLN RQIGSSVFRV ETQBDGKHIA 120 
VGQRNGVETS WLSDQEYAR LQSIDPEGKD KFVFTGGR6G AGHAMVTVAS DITEARQRIL 180 
ELLEPKGTGE SKGAGESKGV GELRESNSGA ENTTETQTST STSSLRSDPK LWLALGTVAT 240 
GLiIGIiAATGI VQALALTPEP DSPTTTDPDA AASATETATR DQLTKEAPQN PDNQKVNIDE 300 
LGNAIPSGVL KDDWANIEE QAKAAGEEAK QQAIENNAQA QKKYDEQQAK RQEELKVSSG 360

AGYGLSGALI L6GGIGVAVT AALHRKNQPV EQTTTTTTTT TTTSARTVEN KPANNTPAQG 420 
NVDTPGSEDT MESRRSSMAS TSSTFFDTSS IGTVQNPYAD VKTSLHDSQV PTSNSNTSVQ 480 
NMGNTDSWY STIQHPPRDT TDNGARLLGN PSAGIQSTYA RLALSGGLRH DMGGLTGGSN 540 
SAWTSNNPP APGSHRFV 558 
<212> Type : PRT 
<211> Length : 558 

SequenceName : SEQ ID 20 

SequenceDescription : 

Sequence 

<213> OrganismName : Escherichia coli 0157:H7 
<400> PreSequenceString : 

MFSTFKKAAL LAAIALPFST MAAPTVTFQG EVTDQTCSVN INGQTNSWL MPTVAMADFG 60 
ATLADGQSAG QTPFTVSVSN CQAPTGADQA INTTPLGYDV DASTGVMGNR DTSSDAAKGF 120 
GIQLMDSSTS GNPVTLAGAT NVPGLTLKVG DTEASYDFGA RYPVIDSAAA TAGKITAVAE 180 
YTLSYL 186 
<212> Type : PRT 
<211> Length : 186 

SequenceName : SEQ ID 21 

SequenceDescription : 

Sequence 



<213> OrganismName : Escherichia coli 0157:H7 
<400> PreSequenceString : 

MNSBGGKPGN VLTVNGNYT6 NNGLMTFNAT LGGDNSPTDK MNVKGDTQGN TRVRVDNIGG 60 
VGAQTVNGIE LIEVGGNSAG NFALTTGTVE AGAYVYTLAK GKGNDEKNWY LTSKWDGVTP 120 
ADTPDPIKNP PWDPEGPSV YRPEAGSYIS KTIAAANSLFS HRLHDRLGEP QYTDSLHSQD 180
SASSMWMRHV GOIERSSAGD GQIiNTQANRY VLQLGGDLAQ WSSNAQDRWH LGVMAGYANQ 240
HSNTQSNRVG YKSDGRISGY SAGLYATWYQ NDANKTGAYV DSWALYNWPD NSVSSDNRSA 300 
DDYDSRGVTA SVEGGYTFEA GTCSGSEGTL NTWYVQPQAQ ITWMGVKDSD HARKDGTRIE 360 
TEGDGNVQTR LGVKTYLNSH HQRDDGKQRE PQPYIEANWI NNSKVYAVKM NGQTVSRDGA 420
RNIiGEVRTGV EAKVNNNLSL WGNVGVQLGD KGYSDTQGML GVKYSW 466 
<212> Type : PRT 
<211> Length : 466 

SequenceName : SEQ ID 22 
SequenceDescription :

Sequence 






<213> OrganismName : Escherichia coli 0157:H7

<400> PreSequenceString :

MSYLMLiRLYQ RNTQCLHIRK HRLAGFFVRL FVACAFAVQA PLSSAELYFN PRFIiADDPQA 60 

VADLSRFENG QEIiPPGTYRV DIYLNNGYMA TRDVTFNTGD SEQ6IVPCLT RAQLASMGLN 120

TASVAGMNIiL ADDACVPLTT MVQDATAKLD VGQQRLNLTI PQAFMSNRAR GYIPPELWDP 180 

GINAGIiXiNYN FSGNSVQNRI GGNSHYAYLN LQSGLNIGAW RLRDNTTWSY NSSDRSSGSK 240 

NKWQHXNTWL ERDIIPLRSR LTLGDGYTQG DIFDGINFR6 AQLASDDNML PDSQRGFAPV 300

IHGIARGTAQ VTIKQNGYDI YNSTVPPGPF TINDIYAAGN SGDLQVTIKE ADGSTQIFTV 360 

PYSSVPLLQR EGHTRYSITA GEYRSGNAQQ EKPRFFQSTL LHGLPAGWTI YGGTQLADRY 420 

RAPNFGIGKKT MGALGALSVD MTQANSTLPD DSQHDGQSVR FIiYNKSIiNES GTNIQLVGYR 480 

YSTSGYFUFA DTTYSRMNGY NIETQDGVIQ VKPKFTDYYN LAYNKRGKLQ LTVTQQLGRS 540 

STLYLSGSHQ TYWGTSNVDE QFQAGLNTAP EDINWTLSYS LTKNAWQKGR DQMLARNVKri 600

PFSHWLRSDS KSQWRHASAS YSMSHDLNGR MTNIiAGVYGT LLEDNNLSYS VQTGYAGGGD 660 

GNSGSTGYAT LNYRGGYGNA NIGYSHSDDI KQLYYGVSGG VIAHANGVTIi GQPLNDTWL 720 

VKAPGAKDAK VENQTGVRTD WRGYAVLPYA TEYRENRVAIi DTNTIiADNVD LDNAVANWP 780 

TRGAIVRAEF KARVGIKLLM TLTHNNKPLP PGAMVTSESS QSSGIVADNG QVYLSGMPLA 840 

GKVQVKWGEE EMAHCVANYQ LPPESQQQLL TQLSAECR 878
<212> Type : PRT 
<211> Length : 878 

SequenceName : SEQ ID 23 
SequenceDescription : 



Sequence 



<213> OrganismName : Escherichia coli 0157:H7 
<400> PreSequenceString : 
MQIIFGEKCV SLLRLFFAAV LMLWCAQTAA YSGQCHTTQG NPYIGVNFGV KTLEEEENTT 60
GWKDKFYQW NESNDYYVSC DCDKDNVRSG RWAFAADSPL VYLGDNWYKI NDYLAAKVLL 120 
QVKGSSPTAV PFENVGTGAD TRWHICDPGG QRLGGQGASG ISfSGSFSLKIL QPFVGSWIP 180 
PMALARLFEC YNIPAGDSCT TTGTPVLVYY LSGTINSLGS CSVNAGETIE VDLGDVFAAN 240 
FRWGHKPLG ARTAELAIPV RCNTGMAGLV NVNLSLTATT DPSYPQAIKT SRPGVGVWT 300 
DSQNNIISPA GGTLPLSIPD DADSIA 326
<212> Type : PRT 
<211> Length : 326 

SequenceName : SEQ ID 24 
SequenceDescription : 



Sequence 



<213> OrganismName : Escherichia coli 0157:H7
<400> PreSequenceString : 

MKIKTLAIW LSALSLSSTA ALAAATTVNG GTVHFKGEW NAACAVDAGS VDQTVQLGQV 60
RTASLAQDGA TSSAVGFNIQ LKDCDTNVAS KAAVAPLGTV IDAGHTNVLA LQSSAAGSAT 120 
NVGVQILDRT GAALTLDGAT PSEQTTLNNG TNTIPPQARY YAIGEATP6A ANADATFKVQ 180 
YQ 182 
<212> Type : PRT 

<211> Length : 182

SequenceName : SEQ ID 25 
SequenceDescription : 



Sequence 

<213> OrganismName : Escherichia coli 0157:H7 
<400> PreSequenceString : 
MKLKVIATLI ATVAVGVSFN SNFASASTTS ASLTVNSNLT MGTCSAQIMD NSNKVINEW 60
FGNVYISELG AKSKVQQFKI RFSNCSGLPQ NSAQIVLAPN GISCAGSQSS SA6FSNKFTD 120 
ASAATRTAVE VWTTDTPESN GSTQFHCAQK IPVPVTLPAD TTTQPYDYPL SARMTVAEGR 180 
IiVTDVRPGNF RSPTTFTITY Q 201 
<212> Type : PRT
<211> Length : 201 

SequenceName : SEQ ID 26

SequenceDescription :

Sequence



<213> OrganismName : Escherichia coli 0157:H7 
<400> PreSequenceString :

£1ASTVEYGBT VDGWLEKDI QLVYGTAmiT KINPGGEQHI KEFGISSKTE IITGGYQYIEM 60 

KTGTAEYSVIiN DGYQIVQMGG AANQTTLNNG VLQVYGAAND PTIKGGRLIV EKDGITVLAA 120

lEKGGLLEVK EGGLAIAVDQ KAGGAIKAST RVMEVFGTNR LGQFEIKNGI ANNMLLENGG 180 

SLRVEENDFA YNTTVDSGGL LEVMDGGTAT GVDKKAG6KL IVSTNABEVS GTNSKGQFSI 240 

KDGVSKNYEL DDGSGLIVME DTQAIDTIIiD EHATMQSLGK DTGTRVQANA VYDLGRSDQN 300 

GSITYSSKAI SEKMVINNGR ANVWAGTMVN VSVRGNDGIB EVMKPQINYA PAMLVGKWV 360 

SBGASLRTHG AVDTSKADVS IiENSAWTIIA DITTTNQNTR LNLANLAMSG ANVIMMDESV 420

TRSSVTASAE NFTTLTTNTL SGNGNFYMRT DMANHQSDQL NVTGQATGDF KIFVTDTGAS 480 

PAAGDSLTLV TTGGGDAAFT LGNAGGWDI GTYEYTLLDN GNHSWSLAEIX RAQITPSTTD 540 

VIjNMAAAQPL VFDAELDTVR ERIiGSVKGVS YDTAMWSSAI NTRNNVTTDA GAGFEQTLTG 600 

LTLGIDSRFS REESSTIRGL FFGYSHSDIG FDRGGKCaSTVD SYTLGAYAGW EHQNGAYVDG 660 

WKVDRFANT IHGKMSNGAT AFGDYNSNGA GAHVESGFRW VDGLWSVRPY LAFTGFTTDG 720

QDYTLSNGMR ADVGNTRIIiR AEAGTAVSYH MDLQNGTTLE PWLKAAVRQE YADSNQVKVN 780 

DDGKFNNDVA GTRGVYQAGI RSSFTPTLSG HLSVSYOTGA GVESPWNTQA GWWTF 836 

<212> Type : PRT 
<211> Length : 836

SequenceName : SEQ ID 27 
SequenceDescription :

Sequence 


<213> OrganismName : Escherichia coli 0157:H7
<400> PreSequenceString : 

MQRKGNKLLI QLCSVTLLFF TTSWYALANE CYIERHAEGD YHMKISSTQL SLASQMVEVP 60 
TEIAEATWDV NIQLRGDAIG CKSLGDSKAV HFLNTADPSL ISTYTTTNGA ALLKTTVPGI 120 

VYSVELLCLS CGAADELDLW LPAQSGADNF IPSTQTKWAY EYSDQSWYLR FRLFITPEFK 180
PKNGVSSGTT lAGKIASWYI GTNDQPWINF YIDNDSLKFF VDEPTCATVA LAQDQGNVSG 240 
HQVTLGNSYV SEVKKFGLTRE IPFSIRAEYC YASKITVKLK AANKPSDATL VGKTTGSASG 300 
VAVKVNSTYD NSKVLLKADG SNTVDYNFAA WSNNLLFLPF TAQLVPDGSG NAVGVGTFSG 360 
NATFSFTYE 369 

<212> Type : PRT

<211> Length : 369 

SequenceName : SEQ ID 28 
SequenceDescription : 

Sequence



<213> OrganismName : Escherichia coli 0157:H7
<400> PreSequenceString : 

MYQFTHQKSR IPKKTLLAAC CALFYSSNGA AADTVEYDSS PLMGTGASTI DVKRYAQGNP 60 

TPPGLYNVRV FVNGQATSSL EIPFVDIGEN SAAACLTHKN LAQLHIKQPE QPVTLLAREG 120

EEEDCLDLAK SYEKADVCFD GSDQFLDLTI PQAYVLKSYG GYVDPSLWES GINAATLAYT 180 

LNAYHTSSDN DNSDSVYGAF NSGIIILGAWH FRT^GNYNWT TDNGSDPDFQ DRYLQRDIPA 240 

IRSQIIMGDA YTTGETFDSV NVRGVRLYSD SRMLPSALAS YAPTIRGVAKT SNAKVTVTQS 300 

GYKIYETTVP PGEFVIDDIS PSGFGSELW TIEEADGSKR TFTQPFSSW QMQRPGVGRW 360 

DFSAGKVIDD SLRSEPNMGQ ASYYYGL3!3NL FTGYTGIQFT DNNYLAGLLG VGINTSIGAF 420

AVDVTHSRAE IPDDKTYQGQ SYRVTWNKLF QDTGTSFNLA AYRYSTQDYL GLHDALVLID 480 

DAKHLSADED KNTMQTYSRM KNQFTVSINQ PLNIAYEDYG SLFISGSWTY YWAANNSRTE 540 

YNVGYSKSVS WGSFSVNLQR SWNEDGEKDD AMYVSVSVPI ENILGGKRKS SGFRNLNTQL 600 

NTDFDGSHQL NVBTSSGNTEKT NLVNYSVNAG YSLDKNAGDL ASVGGYLNYE SGLGGISASA 660 

SATSDNSQQY SISTDGGFVL HSGGLTFTNN SFSSNDTLVL INAL6AKGAR INNSNNEIDR 720

WGYAVTSSVS PYRBNRVGLN IBTLENDVEL KSTSATTVPR SGSWLTRFE TDE6RSAVLN 780 

ITAANGKSIP FAAEVYQGEV MIGSMGQGGQ AFVRGINDSG ELIVRWYENN QTIDCKLHYQ 840 
FPAQPQTQGS TNTIiLIiITNLT CQVANH 866
<212> Type : PRT 
<211> Length : 866

SequenceName : SEQ ID 29 

SequenceDescription : 

Sequence 



<213> OrganismName : Escherichia coli 0157:H7 
<400> PreSequenceString :

MKFKRIiLHSG lASLSLVACG VNAATDLGPA 6DIHFSITIT TKACEMEKSD LEVDMGTMTL 60 
QKPAAVGTVIi SKKOPTXELK ECDGISKATV EMDSQSDSDD DSMFALEAGG ATGVALKIED 120
DKGTQQVPKG SSGTPIEWAI DGETTSIiHYQ ASYVWNTQA TGGTANALVN FSITYE 176 

<212> Type : PRT

<211> Length : 176

SequenceName : SEQ ID 30
SequenceDescription : 

Sequence



<213> OrganismName : Escherichia coli 0157:H7
<400> PreSequenceString :

MKYNNIIFLG LCLGLT17YSA LSADSVIKIS GRVLDYGCTV SSDSLNFTVD LQKNSARQFP 60 
TTGSTSPAVP FQITLSECSK GTTGVRVAFN GIEDAENNTL LKLDEGSNTA SGLGIEILDG 120
NMRPVKLNDL HAGMQWXPLV PEQNNILPYS ARLKSTQKSV NPGLVRASAT FTLEFQ 176 

<212> Type : PRT 
<211> Length : 176
SequenceName : SEQ ID 31

SequenceDescription :

Sequence 

<213> OrganismName : Escherichia coli 0157:H7
<400> PreSequenceString : 

MKWRKRGYLL AAILALi^SAT IQAADVTITV NGKWAKPCT VSTTNATVDL 6DLYSFSLMS 60 
AGAASAWHDV ALELTNCX>VG TSRVTASFSG AADSTGYYKN QGTAQNIQLE LQDDSC^TLN 120 
TGATKTVQVD DSSQSAHFPL QVRALTVNGG ATQGTIQAVI SITYTYS 167 
<212> Type : PRT

<211> Length : 167 

SequenceName : SEQ ID 32 
SequenceDescription :

Sequence



<213> OrganismName : Escherichia coli 0157:H7 
<400> PreSequenceString : 

MKRAPLITGL LLISTSCLAYA SSEGCGADST SGATNYSSW DDVTVNQTDN VTGREFTSAT 60 
LSSTNWQYAC SCSAGKA."VKXj VYMVSPVLTT TGHQTGYYKL NDSLDIKTMN RPGNPGD 117

<212> Type : PRT 
<211> Length : 117 

SequenceName : SEQ ID 33 
SequenceDescription :

Sequence

<213> OrganismName : Escherichia coli 0157:H7 

<400> PreSequenceString :

MKKALLAAAL VMASGSALAV DGGHIDFNGM VQSGTCKVGV VDTGMHSVTT DGWTLDTAN 60 
VTDTFAEVSA TAVGLLPKEF MISVECDPGA PKNAELTMGS ASYANTSGTL NNNMNITVNG 120 
IAPAQNVNIA VHNMKNKAGA AEIKQVHMNN SSEVQELTLD AEGKGQYVFN ASYVKAPNSP 180 
AVTAGHVTTN ALYTVAYK 198 

<212> Type : PRT

<211> Length : 198 

SequenceName : SEQ ID 34 
SequenceDescription :
Sequence 

<213> OrganismName : Escherichia coli 0157:H7
<400> PreSequenceString : 

MKPNMIVGAL ALTSVFMAGH LQAADGTVHF RGEIIDSTCE VTPETKDQW DLGKVNRTAP 60 
SGVDDVAAPT APSIDLTQCP ETFKSAAIRF DGNEDAHGNG NIAIGTPLDN SNDAAAGISP 120 
SDNSGDYTGA GAVSAAKGVA IRLYNRADNT QVKLYENSAS TPISN6NASM KFMARYIATE 180 
TTIDPGTANA DSQFTVEYIK 200
<212> Type : PRT 
<211> Length : 200 

SequenceName : SEQ ID 35

SequenceDescription :


Sequence 



<213> OrganismName : Escherichia coli 0157:H7 
<400> PreSequenceString :

MPIFQREGHL KYSFAAGEYQ AGNYDSASPR FGQLDLIYGL PWGMTAYGGV LISNNYNAFT 60
LGIGKNFGYI GAISIDVTQA KSELNNDRDS QGQSYRFLYS KSFESGTDFR LAGYRYSTSG 120 
FYTFQEATDV RSDADSDYNR YHKRSEIQGN LTQQLGAYGS VYLNLTQQDY WNDAGKQNTV 180

SAGYNGRIGK VSYSIAYSWN KSPEWDESDR LWSFNISVPL GRAWSNYRVT TDQDGRTNQQ 240 
VGVSGTLLED RNLSYSVQEG YASNGVGNSG NANVGYQGGS GNVNVGYSYG KDYRQLMYSV 300 

RGGVIVHSEG VTLSQPLGET MTLISVPGAR NARWNNGGV QVDWMGNAIV PYAMPYRENE 360
ISLRSDSLGD DVDVENAFQK WPTRGAIVR ARFDTRVGYR VLMTLLRSAG SPVPFGATAT 420 
LITDKQNEVS SIVGEEGQIiY ISGMPEEGRV LIKWGNDASQ QCVAPYKLSL ELKQGGIIPV 480
SANCQ 485 
<212> Type : PRT 

<211> Length : 485

SequenceName : SEQ ID 36
SequenceDescription : 



Sequence 

<213> OrganismName : Escherichia coli 0157:H7
<400> PreSequenceString :

MSGYTVKPPT GDSNEQTQFI DYFNLFYSKR DQEQISISQQ LGNYGATFFS ASRQSYWNTS 60 
RSDQQISFGL NVPFGDITTS LNYSYSNNIW QNDRDHLLiAF TLNVPFSHWM RTDSQSAFRN 120

SNASYSMSND LKGGMTNLSG VYGTLLPDNN LNYSVQVGNT HGGNTSSGTS GYSTLNYRGA 180
YGNTNVGYSR SGDSSQIYYG MSGGIIAHAD GITFGQPLGD TMVLVKAPGA DNVKIENQTG 240 
IHTDWRGYAI LPFATEYREN RVALNANSLA DNVELDETW TVIPTHGAIA RATFNAQIGG 300

KVLMTLKYGM KSVPFGAIVT HGENKNGSIV AENGQVYIiTG LPQSGKLQVS WGNDKNSNCI 360

VDYKLPEVSP GTLLNQQTAI CR 382

<212> Type : PRT

<211> Length : 382 

SequenceName : SEQ ID 37
SequenceDescription : 



Sequence



<213> OrganismName : Escherichia coli 0157:H7 
<400> PreSequenceString :

MSALYERSQL TQVMISSAPA TAETMEKAEY LRLDCTIKEV QFTAGQKQDI DVTTLCSTEQ 60 

ENINGLGASS EISMSGNFYL NQAQNALRDA YDNDTVYAFK VQFPSGKGFK FLAEVRQHTW 120

SSGTNGWAA TFSLRLKGKP VSYWPLAFV KNLDKTLTVN TGALLTMSVS VNGGTPPYKH 180 

AWKKDGQPVE GQTTDTPSKP GAQSGDKGAY TCEVTDSAEQ PQSITSDACT VTVNGAGG 238 

<212> Type : PRT 
<211> Length : 238

SequenceName : SEQ ID 38 
SequenceDescription : 



Sequence 


<213> OrganismName : Escherichia coli 0157:H7
<400> PreSequenceString : 
MRNKPFYIiLC AFLWIAVSH2V XiAADSTITIR GYVRDNGCSV AAESTNFTVD LMENAAKQFN 60
NIGATTPWP FRILLSPCGNT A.VSAVKVGFT GVADSHNANL IiALENTVSAA SGLGIQLLNE 120
QQNQIPLNAP SSAISWTTLT PGKPNTLNFY ARLMATQVPV TAGHINATAT FTLEYQ 176



<212> Type : PRT 

<211> Length : 176 

SequenceName : SEQ ID 39 
SequenceDescription :

Sequence 



<213> OrganismName : Escherichia coli 0157:H7 
<400> PreSequenceString : 

MNKSWSISA AMtiVLLCQPV MGSEISPATP -SDEDNYTFOP QLFRGSRFSQ SSIiAKLlTRE 60 

SVAPGNYKMD lYTNNKLSGS WNVTFKEAAD GRVLPCLTPE VADAIGLKTG EDKGEKDPVC 120 

TFAKEIiAPGI TSQTQLSQLR LDLSVPQSQL ISRPR6YVPP SELDTGASLA FMNYIANYYN 180 

VAYSGQNAHS QRSLWASFNG GINLGAWQYR QLSNMTWDND KGNQWNNIRS YLQRPIjPAIN 240 

SQLMMGQLIT SGRFFSGLSY HGVSLATDER MLPDSMRGYA PTIRGVAATIT ARVSVMQNGH 300

EIYQTTVAPG PFEINDLYPT SYSGDIiDVTV TEANGAVSRF SVPFSAVPES MRPCTSRYIsTV 360

EVGKTQDSGD DSMFGDLTWQ HGMTNTLTFN SGSRIADGYQ ALMLGGVYGS SLGAFGANLT 420

WSHARVPESE AQSGWMSQLT WSKTFQPTST TVSLAGYRYS TSGYRDLADV LGERHAASNK 480 

QSWDSSQWRQ QSRFDLTLSQ SIiANYGNLFV SGSTQNYRGG KSRDTQLQLG YSMSFSHGIS 540 

MNLSVGRQRM GGYKDNSDDM QTVTSLSFSF PLGGNGPRVP SLSNSWTHST DGSSQLQSSL 600 

TGMLDEAQTT NYSLNVMRDQ QYKQTTLSGN MQKRFSQTTV GLNASKGQDY WQASGNVQGA 660 

MAVHGGGITP GPYLGETFAL VEAKGAEGAK VYNSSQLEIN DSGYALVPAV TPYRYNRISL 720 

DPQGMDGDAE LVDSERQVAP VAGAAVKVIF RTRPGKALLI KSRMADGSEL PMGADVIiDEN 780 

NTWGIAQQG GQIYLRTEQT KGHLSVRWGE GANDSCQLPF DISGKDSNSP IIRLNETCQS 840 

<212> Type : PRT 
<211> Length : 840 

SequenceName : SEQ ID 40

SequenceDescription :

Sequence 



<213> OrganismName : Escherichia coli 0157:H7 
<400> PreSequenceString :

MKLAACFLTL LPGFAVAASW XSPGFPAFSE QGTGTFVSHA QLPKGTRPLT LNFDQQCWQP 60 
ADAIKLNQML SLQPCSNTPP QWRLFRDGKY TLQIDTRSGT PTLMISIQNA AEPVANLVRE 120

CPKWDGLPLT LDVSATFPEG AAVRDYYSQQ lAIVKNGQIT LQPAATSNGL LLLERAETDA 180 

SAPFDWHNAT VYFVLTDRFE KTGDPSNDQSY GRHKDGMAEI GTFHGGDLRG LTNKLDYLQQ 240

LGVNALWISA PFEQIHGWVG GGTKGDFPHY AYHGYYTQDW TNLDANMGNE ADLRTLVDSA 300

HQRGIRILFD WMNHTGYAT LADMQEYQFG ALYLSGDEVK KTLGERWSDW KPAAGQTWHS 360

FNDYINFSDK TGWDKWWGKN WIRTDIGDYD NPGFDDLTMS LAFLPDIKTE STTASGLPVF 420 

YKNKTDTHAK AIDGFTPRDY LTHWLSQWVR DYGIDGFRVD TAKHVELPAW QQLKTEASAA 480 

LREWKKANPD KALDDKPFWM XGEAWGHGVM QSDYYRHGFD AMINFDYQEQ AAKAVDCIAQ 540 

MDTTWQQMAE KLQGFNVLSY LSSHDTRLFR EGGDKAAELL LLAPGAVQIF YGDESSRPFG 600 

PTGSDPLQGT RSDMNWQDVS GKSAANVAHW QKISQFRARH PAIGAGKQTT LSLKQGYGFV 660 
REHGDDKVLV IWAGQQ 676 
<212> Type : PRT 
<211> Length : 676 

SequenceName : SEQ ID 41 

SequenceDescription :

Sequence 



<213> OrganismName : Escherichia coli 0157:H7 
<400> PreSequenceString :

MPQRHHQGHK RTPKQLALII KRCLPMVLTG SGMLCTTANA EEYYFDPIML ETTKSGMQTT 60 

DLSRPSKKYA QLPGTYQVDI WLNKKKVSQK KITFTANAEQ LLQPQFTVEQ LRELGIKVDE 120 

IPALAEKDDD SVINSLEQII E>GTAAEFDFN HQRLNLSIPQ lALYRDARGY VSPSRWDDGI 180 

PTLFTNYSFT GSDNRYRQ6N RSQRQYLNMQ NGANFGPWRL RNYSTWTRND QASSWNTISS 240 

YLQRDIKALK SQLLLGESAT SGSIFSSYNF TGVQLASDDH MLPNSQRGFA PTVRGIANSS 300 

AIVTIRQNGY VIYQSNVPAG AFEINDLYPS SNSGDLEVTI EESDGTQRRF IQPYSSLPMM 360

QRPGHLKYSA TAGRYRADAN SDSKEPEFAE ATAIYGLNNT FTLYGGLLGS EDYYALGIGI 420 

GGTLGALGAL SMDINRADTQ FDNQHSFHGY QWRTQYIKDI PETNTNIAVS YYRYTNDGYP 480 

SFDEANTRNW DYNSRQKSEI QFNISQTIFD GVSLYASGSQ QDYWGNNEKN RNISVGVSGQ 540 
QWGIGYSLNY QYSRYTDQKnsi DRALSLNLSI PLERWLPRSR VSYQMTSQKD RPTQHEMRLD 600

GSLLDDGRLS YSIiEQSLDDD NNHNSSVNAS YRSPY6TFSA GYSYGNDSSQ YNYGVTGGW 660 

IHPHGVTLSQ YLGNAFALXD ANGASGVRIQ ISTYPGIATDPF GYAWPYLTT YQENRLSVDT 720 

TQLPDNVDLE QTTQFWPKTR GAMVAARPNA NIGYRVLVTV SDRNGKPLPF GAIjASNDDTG 780 

QQSIVDEGGI LYLSGISSKS QSWTVRWGNQ ADQQCQFAFS TPDSEPTTSV LQGTAQCH 838

<212> Type : PRT 
<211> Length : 838 

SequenceName : SEQ ID 42
SequenceDescription :

Sequence 

<213> OrganismName : Escherichia coli 0157:H7

<400> PreSequenceString :

MMFRNRILLI FILWANFTWA GCRTTASLNI TDGINVGEIL AITETSFSKSV VFTGISCDTS 60 
TDKIVYKNIQ SDWVEVGPFG KTGEKLKVKIE SLGKTSDTIG KSSNAQAVLP YWKIARGTP 120 
DFTGERKSTW FISDTVIAKTI GGESSSSIDF WLGICKALKF NWCVNYLTSK LAGDTFTLGL 180
NISYYPKNTT CKPENTVIKV DDIAIiFQLRN QGKIAANSKE GTITLKCDNL FGDKKQASRN 240 

MWYLSSSDL VKGSNTILRG KTDKTGVGFVL DLTEPPKGTE AAIKISANGD QGAATSLWKT 300
DKPGVSLNSN IINIPVMASY YVYDEKKVKS GALEATALIN VKYD 344 
<212> Type : PRT 
<211> Length : 344 

SequenceName : SEQ ID 43 

SequenceDescription :

Sequence 






<213> OrganismName : Escherichia coli 0157:H7

<400> PreSequenceString :

MIKKASLLTA CSVTAFSAWA QDTSPDTLW TANRFEQPRS TVLAPTTWT RQDIDRWQST 60

SVNDVLRRLP GVDITQNGGS GQLSSIFIRG TNASHVLVLI DGVRLNLAGG SGSADLSQFP 120

lALVQRVEYI RGPRSAVYGS DAIGGWNII TTRDEPGTEI SAGWGSNSYQ NYDVSTQQQL 180 
GDKTRVTLLG DYAHTHGYDV VAYGNTGTQA QPDNDGFLSK TLYGALEHNF TDAWSGFVRG 240 

YGYDNRTNYD AYYSPGSPLV DTRKLYSQSW DAGLRYNGEL IKSQLITSYS HSKDYiTYDPH 300

YGRYDSSATL DEMKQYTVQW ANNIIIGHGN VGAGVDWQKQ STAPGTAYVK DGYDQRNTGI 360 
YLTGLQQVGD FTFEGAARSD DNSQFGRHGT WQTSAGWEFI EGYRFIASYG TSYKAPNLGQ 420 
LYGFYGNPNIi DPEKSKQWEG AFEGLTAGVN WRISGYRNDV SDLIDYDDHT LKYYNEGKAR 480

IKGVEATANF DTGPLTHTVS YDYVDARNAI TDTPLLRRAK QQVKYQLDWQ LYDFDWGITY 540 

QYLGTRYDKD YSSYPYQTVK MGGVSLWDLA VAYPVTSHLT VRGKIANLFD KDYETVYGYQ 600

TAGREYTLSG SYTF 614 
<212> Type : PRT 
<211> Length : 614 

SequenceName : SEQ ID 44

SequenceDescription :

Sequence 



<213> OrganismName : Escherichia coli 0157:H7 
<400> PreSequenceString :

MKNKLLFMML TILGAPGIAA AAGYDLANSE YNFAVNELSK SSFNQAAIIG QAGTNNSAQL 60 
RQGGSKLLAV VAQEGSSNRA KIDQTGDYNL AYIDQAGSAUT DASISQGAYG NTAMIIQKGS 120 
GNKANITQYG TQKTAIWQR QSQMAIRVTQ R 151 
<212> Type : PRT 
<211> Length : 151

SequenceName : SEQ ID 45
SequenceDescription :



Sequence 



<213> OrganisraName = Escherichia coli 0157:H7 
<400> PreSequenceString : 

MNIFAYLLVL VFSMSMSSSA FASWMTGTR IIFPGDAKEK TIQLRNTSDQ PYIINIHVBD 60 

ERGSDKNVPP MPTPQTFRME AAAGQALRLL YTGNNLPQDR ESVPWPSPSQ LPYLNKNDKS 120 

65 QNQLILALTN RVKXFYRPSS IVGKSSI3APK NLTYQVKQNR lEVTNPTGYY VTIRAAELLN 180 

NGKKVPLANS VMIAPQSTTE WTLPSGISVA PGAQIHLVTV NDYGVNVTSE HAL 233 
<212> Type : PRT

<211> Length : 233

SequenceName : SEQ ID 46
SequenceDescription :

Sequence

<213> OrganismName : Escherichia coli 0157:H7
<400> PreSequenceString :

MKRLHICRFLIi ATFCALLTAT LQAADVTITV NGRWAKPCT IQTKEANVNL GDLYTRNLQQ 60 
PGSASGWHNI TLSLTDCPAE TSAVTAIVTG STDNTGYYKN EGTAENIQIE LRDDQDATLK 12 0 

NGDSKTVIVD EITRNAQFPL KARAITVNGN ASQGTIEALI NVIYTWQ 167 
<212> Type : PRT 

<211> Length : 167

SequenceName : SEQ ID 47 
SequenceDescription : 

Sequence 



<213> OrganismName : Escherichia coli 0157:H7 
<400> PreSequenceString :

MRAKLLGIVL TTPIAISSFA STETLSFTPD NINADISLGT LSGKTKERVY LAEEGGRKVS 60 
QLDWKFNNAA IIKGAINWDL MPQISIGAAG WTTLGSRGGN MVDQDWMDSS NPGTWTDESR 12 0 

HPDTQLNYAN EFDLNIKGWL LNEPNYRLGL MAGYQESRYS FTARGGSYIY SSEEGFRDDI 18 0 

GSFPNGERAI GYKQRFKMPY IGLTGSYRYE DFELGGTFKY SGWVEASDND EHYDPGKRIT 240 
YRSKVKDQNY YSVSVNAGYY VTPNAKVYVE GTWNRVTNKK GNTSLYDHND NTSDYSKNGA 3 00 

GIENYNFITT AGLKYTF 317 
<212> Type : PRT 
<211> Length : 317 

SequenceName : SEQ ID 48 

SequenceDescription :

Sequence 



<213> OrganismName : Escherichia coli 0157:H7
<400> PreSequenceString :

MFFKRGKILS AGRLNKKSLG IVMLLSVGLL LAGCSGSKSS DTGTYSGSVY TVKRGDTLYR 60 
ISRTTGTSVK ELARLNGXSP PYTIEVGQKL KLGGAKSSSS TRKSTAKSTT KTASVTPSSA 12 0 

VPKSSWPPVG QRCWLWPTTG KVIMPYSTAD GGNKGIDISA PRGTPIYAAG AGKWYVGNQ 180 
LRGYGNLIMI KHSEDYITAY AHNDTMLVNN GQSVKAGQKI ATMGSTDAAS VRIiHFQIRYR 240 
ATAIDPLRYL PPQGSKPKC 259 
<212> Type : PRT 
<211> Length : 259 

SequenceName : SEQ ID 49 

SequenceDescription :

Sequence 



<213> OrganismName : Escherichia coli 0157:H7
<400> PreSequenceString : 

MPTPNPLAPV KGAGTTLWVY NGNGDPYANP LSDNDWSRLA KVKDLTPGEL TAESYDDSYL 60 
DDEDADWAAT GQGQKSAGDT SFTLAWMPGE QGQQALLAWF NEGDTRAYKI RFPNGTVDVF 120 
RGWVSSIGKA VTAKEVITRT VKVTNVGRPS MAEDRSTVTA ATGMTVTPAS TSWKGQSTT 180 
LTVAFQPEGA TDKSFRAVSA DKTKATVSVS GMTITVKGVA AGKVNIPWS GNGEFAAVAE 240 
INVTAS 246 
<212> Type : PRT 
<211> Length : 246 

SequenceName : SEQ ID 50 

SequenceDescription : 

Sequence 



<213> OrganismName : Escherichia coli 0157:H7 
<400> PreSequenceString : 

MSALYERSQL TQVMISSAPA TAETMDKAEY LRLDCTIKEV QFTAGQKQDI DVTTLCSTEQ 60 
ENINGLGASS EISMSGNFYL NQAQNALRDA YDNDALYAFK VLFPSGKGFK FLAEVRQHTW 120 
SSGTNGWAA TFSLRLKGKP VSFWPIiAPV KNLDKTLTVN TGALLTMSVS ANGGTPPYKY 180 
AWKKDGQPVD GQTTDTFSKP GAQSADAGKY TCWTDSAEK AQSVTSVECT VTVSAAAG 238

<212> Type : PRT 
<211> Length : 238 

SequenceName : SEQ ID 51 

SequenceDescription : 

Sequence 



<213> OrganismName : Escherichia coli 0157:H7 
<400> PreSequenceString : 

MKKSTLALW MGIVASASVQ AAEIYNKDGN KLDVYGKVKA MHYMSDNDSK DGDQSYIRFG 60 

FKGETQINDQ LTGYGRWEAE FAGNKAESDT AQQKTRLAFA GLKYKDLGSF DYGRNLGALY 12 0 

DVEZ^WTDMFP EFGGDSSAQT DNFMTKRASG Li^TYRJSlTDFF GVIDGIiNLTIi QYQGIOSfSNRD 180

VKKQNGDGFG TSLTYDFGGS DFAISGAYTN SDRTNEQNIiQ SRGTGKRAEA WATGLKYDAN 240 

NIYLATFYSE TRKMTPITGG FANKTQNFEA VAQYQFDFGL RPSLGYVLSK GKDIEGIGDE 3 00 

DLVNYIDVGA TYYFNKNMSA FVDYKINQLD SDNKLNIMND DIVAVGMTYQ F 351 



<212> Type : PRT 

<211> Length : 351 

SequenceName : SEQ ID 52 
SequenceDescription : 

Sequence 



<213> OrganismName : Escherichia coli 0157:H7
<400> PreSequenceString : 

MRVKHAWLL MLISPLSWAG TMTFQFRNPN FGGNPNNGAF LLNSAQAQNS YKDPSYNDDF 60 
GIETPSALDN FTQAIQSQIL GGLLSNINTG KJPGRMVTNDY ' IVDIANRDGQ LQLNVTDRKT 12 0 

GQTSTIQVSG LQNITSTDF 138 
<212> Type : PRT 
<211> Length : 138 

SequenceName : SEQ ID 53 

SequenceDescription :

Sequence 



<213> OrganismName : Escherichia coli 0157:H7
<400> PreSequenceString : 

MKRKVLAMLV PALLVAGAAN AAEIYNKDGN KLDLYGKVAG LHYFSDDASS DGDMSYARIG 60 
FKGETQIADQ FTGYGQWEFN IGANGPESDK G3SITATRLAFA GFGFGQNGTF DYGRNYGWY 120 
DVEAWTDMLP EFGGDTYAGA DNFMNGRANS VATYRNNGFF GQVDGLNFAL QYQGNNEKSG 180 
LFDQEGSGNG NGRKLAKENG DGSVCPLPMT LTLV 214 
<212> Type : PRT 
<211> Length : 214 

SequenceName : SEQ ID 54 

SequenceDescription : 

Sequence 



<213> OrganismName : Escherichia coli 0157:H7 
<400> PreSequenceString : 

MNTVTLEGGT FNNISTGTLNDV VKIEKNSMAV XISTNTGSLSTL QLHDGTVNNS GIASARVNAQ 60 

GDAVFNNLAG GEARKGAILY NSAWNNAGT WKMGYQDENN NAGTLDIDDK STFNNSGKLI 12 0 

LDNSKNAIRF QGSNANATLY NTGEMTLDAA LiGAGAILYDD GASEFINKGV VDAKVTVAVS 180 

TAGATESDAF LWNQDGGVIN FDKDNASAVK FTHKNYVALN DGVMNISGNN AVAMEGDKNA 240 

QLVNNGVINL GTEGTTDTGL TGMQLDANAT ADAVIENNGT INIFANDSFA FSVLGTEGHI 3 00 

VNNGTWIAD GVTGSGLIKQ GDSVNVEGVN GISTSGNNTEVH YTDYTLPDMP NTYTTSPFSE 3 60 

TTDSGSSDGS SNNLNGYIVG TNVDGSAGKL KVNNASMNGV GINTGFAAGT ADTTVSFDNV 42 0 

VEGINLTDAD AITSTSWWT AKGSTDASGN VDVIMSKNAY TDVATDASVN DVAKALDAGY 480 

TNNELYTSLN VGTTAELNSA LKQVSGSQAT TVFREARVLS NRFSMLADAA PKVGNGLAFN 540 

WAKGDPRAE LGNNTEYDML ALRKTVDLSE SQSMSLEYGI ARLDGDGAQK AGDNGVTGGY 600 

SQFFGLKHQM SFDNGMRWNN ALRYDVHNLD SSRSVAYGDV SKTADTDVKQ QYLELRSEGA 660 

KTFEPREGLK ITPYAGVKLR HSLEGGYQER KFAGDFNLSMN SGSETAVDSI VGLKLDYAGK 720 

GGWSANATLE GGPNLSYSKS QRTASLAGAG SQHFNVDDGQ KCGGIISTSLAS VGVKYSSKES 780 

SLNLDAYHWK EDGISDKGVM LNFKKTF 807 
<212> Type : PRT 
<211> Length : 807

SequenceName : SEQ ID 55 
SequenceDescription : 

Sequence

<213> OrganismName : Escherichia coli 0157:H7
<400> PreSequenceString :

MLNGISNAAS TLGRQLVGIA SRVSSAGGTG FSVAPQAVRL TPVKVHSPFS PGSSNVNART 60
IFNVSSQVTS FTPSRPAPPP PTSGQASGAS RPLPPIAQAL KEHLAAYEKS KGPEALGFKP 120
ARQAPPPPTS GQASGASRPL PPIAQALKEH LAAYEKSKGP EALGFKPARQ APPPPTSGQA 180
SGASRPLPPI AQALKEHLAA YEKSKGPEAL GFKPARQAPP PPTGPSGLPP LAQALKDHLA 240
AYEQSKKG 248
<212> Type : PRT

<211> Length : 248

SequenceName : SEQ ID 56
SequenceDescription :






Sequence 



<213> OrganismName : Escherichia coli 0157:H7 

<400> PreSequenceString : 

MNKKIHSLAL LWLGIYGVA QAQEPTDTPV SHDDTIWTA AEQNLQAPGV STITADEIRK 60 

NPVARDVSEI IRTMPGVNLT GNSTSGQRGN NRQIDIRGMG PENTLILIDG KPVSSRKTSVR 120 

25 QGWRGERDTR GDTSWVPPEM lERIEVLRGP AAARYGNGAA GGWNIITKK GSGEWHGSWD 180 

AYFNAPEHKE EGATKRTNFS LTGPLGDEFS FRLYGNLDKT QADAWDINQG HQSARAGTYA 240 

TTLPAGREGV INKDINGWR WDFAPLQSLE LEAGYSRQGN LYAGDTQNTN SDAYTRSKYG 3 00 

DETNRLYRQN YSLTWNGGWD NGVTTSNWVQ YEHTRNSRIP EGLAGGTEGK FNEKATQDFV 3 60 

DNDLDDVMLH SEVNLPIDFL VNQTLTLGTE WNQQRMKDLS SNTQALTGTN TGGAIDGVSA 420 

30 TDRSPYSKAE IFSLFAENNM ELTDSTIVTP GLRFDHHSIV GNNWSPALNI SQGLGDDFTL 480 

KMGIARAYKA PSLYQTNPNY ILYSKGQGCY ASAGGCYLQG NDDLKAETSI NKEIGLEFKR 540 

DGWLAGITWF RNDYRNKIEA GYVAVGQNAV GTDLYQWDNV PKAWEGLEG SLNVPVSETV 600 

MWTNNITYML KBENKTTGDR LSIXPEYTLN STLSWQARED LSMQTTFTWY GKQQPKKYNY 660 

KGQPAVGPET KEISPYSIVG LSATWDVTKN VSLTGGVDNL FDKRLWRAGN AQTTGDLAGA 720 

35 NYXAGAGAYT YNEPGRTWYM SVNTHF 746 
<212> Type : PRT
<211> Length : 746

SequenceName : SEQ ID 57
SequenceDescription :






Sequence 



<213> OrganismName : Escherichia coli 0157:H7 

<400> PreSequenceString : 
45 MGGRFSLRYK KLSYRFVFLT LAGCSSVGNQ SLKNETQESV KTKIVKGKTT KQDVLASFGE 60 

PDSRSLIDGE EQWSYTMYNS QSKATSFIPV VGLIiAGGADS QTKSLTVSFK GEKVSTYIFN 12 0 

AGTSNVKTGI F 131 

<212> Type : PRT 

<211> Length : 131 
SequenceName : SEQ ID 58

SequenceDescription : 



Sequence 



<213> OrganismName : Escherichia coli 0157:H7
<400> PreSequenceString : 

MKKIACLSAL AAVLAFTAGT SVAATSTVTG GYAQSDAQGQ MNKMGGFNLK YRYEEDNSPL 60 
GVIGSFTYTE KSRTASSGDY NKNQYYGITA GPAYRINDWA SIYGWGVGY GKFQTTEYPT 120 
YKHDTSDYGF SYGAGLQFNP MENVALDFSY BQSRIRSVDV GTWIAGVGYR F 171 


<212> Type : PRT 
<211> Length : 171 

SequenceName : SEQ ID 59 

SequenceDescription : 


Sequence 
<213> OrganismName : Escherichia coli 0157:H7
<400> PreSequenceString : 

MKSIATLWC AISGIACVNIi SAHAAEGEHT ISLGYAHFQF PGLKDFVKDA TAHNRETFSH 60 
FVNRNYFSSL 6EYTDGRVSG YEGKDKNPQG INIRYRYEIT D3DFGVITSFT WTRSLTNSQT 120 
5 FIDVQSADHT RKIKNPAASA RTDIRANYWS LLAGPSWRVN QYMSLYAMAG MGVAKVSADL 180 
KIKDNINSSG GFSESNSTKK TSLAWAAGAQ FNLNESVTLD VA.YEGSGSGD WRTSGVTAGI 240 
GLKP 244 
<212> Type : PRT 
<211> Length : 244 
SequenceName : SEQ ID 60

SequenceDescription : 

Sequence 






<213> OrganismName : Escherichia coli 0157:H7
<400> PreSequenceString : 

MRKLYAAILS AAICLTVSGA PAWASEQQAT LSAGYLHVST KTAPGSDMLNG INVKYRYEFT 60 
DTLGLVTSPS YAGDRNRQIT RYSDTRWHED SVRNRWFSVM A.GPSVRVNEW FSAYAMAGVA 120 
YSRVSTFSGD YIiRVTDNKGK THDVLTGSDD GRHSNTSIAW GAGVQFNPTE SVAIDIAYEG 180 
20 SGSGDWRTD6 FIVGVGYKP 199 
<212> Type : PRT 
<211> Length : 199 

SequenceName : SEQ ID 61 
SequenceDescription :



Sequence 



<213> OrganismName : Escherichia coli 0157:H7 
<400> PreSequenceString : 

30 MRKLYAAILS AAICLAVSGA PAWASEQQAT LSAGYLHART SAPGSDNLNG INVKYRYEFT 60 
DTLGLVTSFS YAGDKNRQLT RYSDTRWHED SVRNRWFSVM A.GPSVRVNEW FSAYAMAGVA 12 0 

YSRVSTFSGD YLRVTDNKGK THDVLTGSDD GRHSNTSLAW SAGVQFNPTE SVAIDIAYEG 180 
SGSGDWRTDG FIVGVGYKF 199
<212> Type : PRT

<211> Length : 199

SequenceName : SEQ ID 62 
SequenceDescription : 



Sequence 



<213> OrganismName : Escherichia coli 0157:H7
<400> PreSequenceString : 

MRKLYAAILS AAICLAVSGA PAWASEQQAT LSAGYLHART SAPGSDNLNG INVKYRYEFT 60 

DTLGLVTSFS YAGDKNRQLT RYSDTRWHED SVRNRWFSVM AGPSVRVNEW FSAYAMAGVA 12 0 

45 YSRVSTFSGD YLRVTDNKGK THDVLTGSDD GRHSNTSLAW CSAGVQFNPTE SVAIDIAYEG 180 

SGSGDWRTDG FIVGVGYKF 199 

<212> Type : PRT 

<211> Length : 199 

SequenceName : SEQ ID 63 
SequenceDescription :

Sequence

<213> OrganismName : Escherichia coli 0157:H7

<400> PreSequenceString :

MRKLYAAILS AAICLAVSGA PAWASEQQAT LSAGYLHART SAPGSDNLNG INVKYRYEFT 60 
DTLGLVTSFS YAGDKNRQLT RYSDTRWHED SVRNRWFSVM JVGPSVRVNEW FSAYAMAGVA 12 0 

YSRVSTFSGD YLRVTDNKGK THDVLTGSDD GRHSNTSLAW GAGVQFNPTE SVAIDIAYEG 180 
SGSGDWRTDG FIVGVGYKP 199

<212> Type : PRT

<211> Length : 199 

SequenceName : SEQ ID 64 
SequenceDescription : 

Sequence

<213> OrganismName : Escherichia coli 0157:H7
<400> PreSequenceString :

MVMSQKTLFT KSALAVAVAL ISTQAWSAGF QLNBFSSS6L GRAYSGEGAI ADDAGNVSRN 60 

PALITMPDRP TFSAGAVYID PDVNISGTSP SGRSLKADNI APTAWVPNMH FVAPINDQFG 120 

WGASITSNYG LATEFNDTYA GGSVGGTTDL ETMNLNBSGA YRLNNAWSFG LGFNAVYARA 180 

5 KIERFAGDLG QLVAGQIMQS PAGKTPQGQA IiAATANGIDS NTKIAHLNGN QWGFGWNAGI 240 

LYELDKNNRY ADTYRSEVKI DFKGNYSSDIi NRVFKTNYGIiP IPTATGGATQ SGYLTLNLPE 3 00 

MWBVSGYNRV DPQWAIHYSL AYTSWSQPQQ LKATSTSGDT LFQKHEGFKD AYRIALGTTY 360 

YYDDNWTFRT GIAFDDSPVP AQNRSISIPD QDRFWLSAGT TYAFNKDASV DVGVSYMHGQ 420 

SVKINEGPYQ FESE6KAWLF GTNFNYAF 448 
<212> Type : PRT

<211> Length : 448

SequenceName : SEQ ID 65

SequenceDescription :

Sequence

<213> OrganismName : Escherichia coli 0157:H7
<400> PreSequenceString : 

MAFSQAVSGL NAAATNLDVI GNNIANSATY GFKSGTASFA DMFAGSKVGL GVKVAGITQD 60 

20 FTDGTTTNTG RGLDVAISQN GPFRLVDSNG SVFYSRNGQF KLDENRNLVN MQGLQLTGYP 12 0 

ATGTPPTIQQ GANPTNISIP NTLMAAKTTT TASMQINLNS SDPLPSWAF DASNADSYNK 18 0 

KGSVTVFDSQ GNAHDMSVYF VKTGDNNWQV YTQDSSDPTG TAEPAMKLVP NANGVLTSNP 240 

TENITTGAIN GAEPATFSLS PLNSMQQNTG ANNIVATTQN GYKPGDLVSY QINDDGTWG 3 00 

NYSNEQTQLL GQIVLANFAN NEGLASEGDN VWSATQSSGV ALLGTAGTGN PGTLTNGAIjE 3 60 

25 ASNVDLSKEL VNMIVAQRNY QSNAQTIKTQ DQILNTLVNL R 401 



<212> Type : PRT 

<211> Length : 401 

SequenceName : SEQ ID 66 
SequenceDescription : 


Sequence 



<213> OrganismName : Escherichia coli 0157:H7
<400> PreSequenceString : 

35 MSKSTFLHIL ISSIILVALX QSSAWAHCTN TQIGQTEDGR TALIEFGKIN MTDTYFAPAG 60 
SLIiATTWPP TNYTSGGATG SSVLWECDAT DLPKriYFLVA TNGDDRVGGF YDAGGPDGLS 12 0 

DVYATWPAFV GIiKQTMAGVT LGRYWKKVPI TSYA.TQGTKI QIRIiQDIPPIi HAEnYRISTL 18 0 

PDTSATTSWC GNKNTDSSGV GFAKPSGTIY jKTCVQPNAYIQ LSGTSGILFG HDEPGEDSSV 240 
HWOFWOADNTG FGYGMRSANR LYNNATCVAR SATPLVIiLPT XAEAQIiNAGM ESTGNFNVRV 3 00 

40 BCSNSVQSGI SDTQTALGIQ VSEGAYTAAQ KLGXINSNGG VSALVSDNYD AAEMAKGVGI 3 60 

YISNSAHPDT AMTLVGQPGI AKLTPGGNAA GWYPVFBGAT LBGATHPGYS SYSYSFIARL 420 
KKLPNQTVSA GKVRATAYIIi VKMQ 444 
<212> Type : PRT 
<211> Length : 444 

SequenceName : SEQ ID 67

SequenceDescription : 

Sequence 



<213> OrganismName : Escherichia coli 0157:H7
<400> PreSequenceString : 

MENNRNPPAR QFHSLTFFAG LCIGITPVAQ ALAAEGQTNA DDTLWEAST PSLYAPQQSA 60 

DPKFSRPVAD TTRTMTVISE QVIKDQGATN LTDAJiKNVPG VGAFFAGENG NSTTGDAIYM 120 

RGADTSNSIY IDGIRDIGSV SRDTFNTEQV EVIKGPSGTD YGRSAPTGSI NMISKQPRND 180 

55, SGIDASASIG SAWFRRGTLD VNQVIGDTTA VRLNVMGEKT HDAGRDKVKN ERYGVAPSIA 240 

FGLGTANRLY LNYLHVTQHN TPDGGIPTIG LPGYSAPSAG TATLNHSGKV DTHNFYGTDS 3 00 

DYDDSTTDTA TMRFEHDIND NTTIRNTTRW SRVKQDYLMT AIMGGASNIT QPTSDVNSWT 3 60 

WSRTANTKDV SNKILTNQTKT LTSTFYTASI GHDVSTGVEP TRETQTNYGV NPVTLPAVNI 420 

YHPDSSIHPG GLTRNGANAN GQTDTFAIYA FDTLiQITRDF ELNGGIRLDN YHTEYDSATA 480 

60 CGGSGRGAIT CPAGVAKGSP VTTVDTAKSG NLVNWKAGAL YHLTENGNVY INYAVSQQPP 540 

GGNNFALAQS GSGNSANRTD FKPQKANTSE IGTKWQVLDK RLLLTAALFR TDIENEVEQN 600 

DDGTYSQYGK KRVBGYEISV AGWITPAWQV IGGYTQQKAT IKNGKDVAQD GSSSLPYTPE 660 

HAFTLWSQYQ ATDDISVGAG ARYIGSMHKG SDGAVGTPAF TEGYWVADAK LGYRVNRNLD 720 

FQLNVYNIiFD TDYVASINKS GYRYHPGEPR TFLLTANMHF 760 

<212> Type : PRT

<211> Length : 760 

SequenceName : SEQ ID 68 
SequenceDescription :
Sequence 



<213> OrganismName : Escherichia coli 0157:H7
<400> PreSequenceString : 

MQMKKLLPIL IGLSLSGFSS LSQAENLMQV YQQARLSNPE LRKSAMRDA AFEKXNEARS 60 

PLLPQLGLGA DYTYSNGYRD ANGINSNATS ASLQIiTQSIF DMSKWRALTIi QEKAAGIQDV 120 

TYQTDQQTLI LNTATAYFNV LNAIDVLSYT QAQKEAIYRQ LDQTTQRFNV GLVAITDVQN 18 0 

10 ARAQYDTVLA NEVTARNNIjD NAVEQLRQIT GNYYPEIiAAIi NVENFKTDKP QPWALLKEA 240 

EKRNLSLLQA RLSQDIiAREQ IRQAQDGHLP TliDLTASSGI SDTSYSGSKT RGAAGTQYDD 3 00 

SNMGQNKVGL SFSLPIYQGG MVNSQVKQAQ YNFVGASEQL ESAHRSWQT VRSSFNNINA 3 60 

SISSINAYKQ AWSAQSSLD AMEAGYSVGT RTIVDVLDAT TTLYNAKQEL ANARYNYLIN 42 0 

QLNIKSALGT LKEQDLLALlJr KTAIiSKPSfiSTN PENVAPQTPE QNAIADGYAP DSPAPWQQT 480 

15 SARTTTSNGH NPFRN 495 



<212> Type : PRT 

<211> Length : 495 

SequenceName : SEQ ID 69
SequenceDescription :

Sequence

<213> OrganismName : Escherichia coli 0157:H7
<400> PreSequenceString : 

25 MTKLKLIALG VLIATSAGVA HAEGKFSL6A GVGWEHPYK DYDTDVYPVP VINYEGDNFW 6 0 

FRGLGGGYYL WNDATDKLSI TAYWSPLYFK AKDSGDHQMR HLDDRKSTMM AGLSYAHFTQ 12 0 

YGYLRTTLAG DTIiDNSNGIV WDMAVfLYRYT NGGLTVTPGI GVQWNSENQN EYYYGVSRKE 180 
SARSGLRGYN SNDSWSPYLE LSASYNFLGD WSVYGTARYT RLSDEVTDSP IVDKSWTGLI 240 
STGITYKF 248 

<212> Type : PRT

<211> Length : 248

SequenceName : SEQ ID 70
SequenceDescription :

Sequence



<213> OrganismName : Escherichia coli 0157:H7 
<400> PreSequenceString : 

MKKTLLAAGA VLALSSSFTV NAAENDKPQY LSDWWHQSVN WGSYHTRFG PQIRNDTYLE 60 

40 YEAFAKKDWF DFYGYADAPV FFGGNSDAKG IWNHGSPLFM EIEPRFSIDK LTNTDLSFGP 120 

FKEWYFAMNY IYDMGRNKD6 RQSTWYMGLG TDIDTGLPMS LSMNVYAKYQ WQNYGAANENT 180 

EWDGYRFKIK YFVPITDLWG GQLSYIGFTKT FDWGSDLGDD SGNAINGIKT RTNNSIASSH 240 

ILALNYDHWH YSWARYWHD GGQWNDDAEL NFGKTGNFNVR STGWGGYLW GYNF 294 

<212> Type : PRT

<211> Length : 294

SequenceName : SEQ ID 71
SequenceDescription :

Sequence



<213> OrganismName : Escherichia coli 0157:H7 
<400> PreSequenceString : 

MLSTQFNRDN QYQAITKPSL LAGCIALALL PSAAFAAPAT EETVIVEGSA TAPDDGENDY 60 

55 SVTSTSAGTK MQMTQRDIPQ SVTIVSQQRM EDQQLQTLGE VMENTLGISK SQADSDRALY 12 0 

YSRGFQIDNY MVDGIPTYFE SRWNLGDALS DMALFERVEV VRGATGLMTG TGNPSAAINM 180 

VRKHATSREF KGDVSAEY6S WNKERYVADL QSPLTBDGKI RARIVGGYQN NDSWLDRYNS 240 

EKTFFSGIVD ADLGDLTTLS AGYBYQRIDV NSPTWGGLPR WNTDGSSNSY DRARSTAPDW 3 00 

AYNDKEINKV FMTLKQRFAD TWQATLNATH SEVEFDSKMM YVDAYVNKAD GMLVGPYSNY 360 

60 GP6FDYVGGT GWNSGKRKVD ALDLFADGSY ELFGRQHNLM FGGSYSKQNN RYFSSWANIF 420 

PDEIGSFYNF NGNFPQTDWS PQSLAQDDTT HMKSLYAATR VTLADPLHLI L6ARYTNWRV 4 80 

DTLTYSMEKN HTTPYAGLVF DINDNWSTYA SYTSIFQPQN DRDSSGKYLA PITGNNYELG 540 

LKSDWMNSRL TTTLAIFRIE QDNVAQSTGT PIPGSNGETA YKAVDGTVSK GVEFELNGAI 600 

TDNWQLTFGA TRYIAEDNEG NAVNPNLPRT TVKMFTSYRL PVMPELTVGG GVNWQNRVYT 660 

65 DTVTPYGTFR AEQGSYALVD LFTRYQVTKN FSLQGNVNNL FDKTYDTNVE GSIVYGAPRN 720 

FSITGTYQF 729 
<212> Type : PRT 
<211> Length : 729

SequenceName : SEQ ID 72
SequenceDescription :

Sequence

<213> OrganismName : Escherichia coli 0157:H7 
<400> PreSequenceString : 

MARFQFKNRK NN6LIPFISF MVMGEAAIAA PLPQWi^APA VTPVAQLSLQ ESILRAFAKN 60 
10 PGVTQQAAQI GIGEAQIDEA KSAWYPHVGL TGNAGPSRQT DSSGRLDNW SYGITLTQLV 120 
YDFGKTNISTDI NLQTAARDSY RFKLMATLTD VAEKTATAYM EVSRYQAIiCD AAQRNIHSLE 180 
NVYNMAAIiRA NAGLNSSSDE LQAQTRIAGM RSTLEQYQAQ MASAKAQIiAV LTGVQPEAIA 240 
APPAELAEQP VSLKNIDYQS IPLVLAAEKL RQSAQYGVEK TKAQYWPTLS IQGGKTRYQT 3 00 

SDRSYWDDQL QLimrAPIiYQ GGAVSAQVQQ AEGQQKISAS QVEQAKLDVL, QRASVAYAl^W 3 60 

15 TGAR6REEAG IiAQSESAHKT RDVYQNEYKL GKRSLNDLLT VEQDVFQAQS AEIKTAlSrYDGW 420 
VAAVNYAAAV NNLIPLAGIK QGLYNDIiPDL K 451 
<212> Type : PRT 
<211> Length : 451 

SequenceName : SEQ ID 73 
SequenceDescription :

Sequence 



<213> OrganismName : Escherichia coli 0157:H7 

<400> PreSequenceString :

MAKFTPSFSG IKGRALFSLL FAAPMIHATD TATTKDGETI TVTADANTAT EATDGYQPLS 60 

TSTATLTDMP MLDIPQWNT VSDQVLENQN ATTLDEALYN VSNWQTNTLa GGTQDAFVRR 120 

GFGANRDGSI MTNGLRTVLP RSFNAATERV EVLKGPASTL YGILDPGGLX NWTKRPEKT 18 0 

FHGSVSATSS SFGGGTGQLD ITGPIEGTQL AYRLTGEVQD EDYWRNFGKE RSTFIAPSLT 240 

30 WFGDNATVTM LYSHRDYKTP FDRGTIFDLT TKQPVNVDRK IRFDEPFNIT DGQSDLAQLN 3 00 

AEYHLNSQWT ARFDYSYSQD KYSDNQARVT AYDATTGTLT RRVDATQGSX QRMHSTRADL 3 60 

QGNVDIAGFY NEILGGVSYE YYDLLRTDMI RCKNAKDFNI YNPVYGNTSK CTTVSASDSD 420 

QTIKQESYSA YAQDALYLTD NWIAVAGIRY QYYTQYAGKG RPFNVUTDSR DEQWTPKLGL 4 80 

VYKLTPSVSL FANYSQTFMP QSSIASYIGD LPPESSNAYE VGAKFELFDG ITADIALFDI 540 

35 HKRNVLYTES IGDETIAKTA GRVRSRGVEV DLAGALTENI NIIASYGYTD AKVLEDPDYA 600 

GKPLPNVPRH TGSLFLTYDI UNMPGNNTLT FGGGGHCVSR RSATNGADYY LPGYFVADAF 660 

AAYKMKLQYP VTLQLNVKNL FDKTYYTSSI ATNNLGNQIG DPREVQFTVEC MEF 713 



<212> Type : PRT 
<211> Length : 713

SequenceName : SEQ ID 74 
SequenceDescription : 

Sequence


<213> OrganismName : Escherichia coli 0157:H7 
<400> PreSequenceString : 

MRTLQGWLLP VFMLPMAVYA QEATVKEVHD APAVRGSIIA NMLQEHDNPF TLYPYDTNYL 60 
lYTQTSDLNK EAIASYDWAE NARKDEVKFQ LSLAFPLWRG ILGPNSVXiGA SYTQKSWWQL 12 0 

50 SNSEESSPFR ETNYEPQLFL GFATDYRFAG WTLRDVEMGY NHDSNGRSDP TSRSWNRLYT 180 
RLMAENGNWL VEVKPWYWG NTDDNPDITK YMGYYQLKIG YHLGDAVLSA KGQYNWNTGY 240 
GGAELGLSYP ITKHVRLYTQ VYSGYGBSLI DYNFNQTRVG VGVMLNDLF 289 
<212> Type : PRT 
<211> Length : 289 

SequenceName : SEQ ID 75

SequenceDescription : 

Sequence 



<213> OrganismName : Escherichia coli 0157:H7
<400> PreSequenceString : 

MAVQKNVIKG ILAGTFALML SGCVTVPDAI KGSSPTPQQD LVRVMSAPQL YVGQEARFGG 60 
KWAVQNQQG KTRLEIATVP LDSGARPTLG EPSRGRIYAD VNGFLDPVDF RGQLVTWGP 12 0 

ITGAVDGKIG NTPYKFMVMQ ATGYKRWHLT QQVIMPPQPI DPWFYGGRGW PYGHGGWGWY 180 
65 NPGPARVQTV VTE 193 
<212> Type : PRT 
<211> Length : 193 
SequenceName : SEQ ID 76
SequenceDescription :

Sequence

<213> OrganismName : Escherichia coli 0157:H7
<400> PreSequenceString :

MRKQWLGICI AAGMtiAACTS DDGQQQTVSV PQPAVCNGPI VEISGADPRF EPLNATANQD 60 
YQRDGKSYKI VQDPSRFIQA GLAAIYDAEP 6SNLTASGEA FDPTQLTAAH PTXiPIPSYAR 120 
10 ITNLANGRMI WRINDRGPY GNDRVISLSR AAADRLNTSN NTKVRIDPII VAQDGSLSGP 180 
GMACTTVAKQ TYATjPAPPDL SGGAGTSSVS GPQGDILPVS NSTLKSEDPT GAPVTSSGFL 240 
GAPTTLAPGV LEGSEPTPAP QPWTAPSTT PATSPAMVTP QAASQSASGN FMVQVGAVSD 3 00 

QARAQQYQQQ LGQKFGVPGR VTQNGAVWRI QLGPFANKAE ASTLQQRLQT EAQLQSPITT 3 60 

AQ 362

<212> Type : PRT

<211> Length : 362 

SequenceName : SEQ ID 77 

SequenceDescription : 

Sequence

<213> OrganismName : Escherichia coli 0157:H7
<400> PreSequenceString : 

MIKRVLWSM VGLSIiVGCVN NDTLSGDVYT ASEAKQVQNV SYGTIVNVRP VQIQGGDDSN 60 
25 VIGAIGGAVL GGFLGNTVGG GTGRSLATAA GAVAGGVAGQ GVQSAMNKTQ GVELEIRKDD 120 

GNTIMWQKQ GNTRFSPGQR WLASNGSQV TVSPR 155 

<212> Type : PRT 

<211> Length : 155 

SequenceName : SEQ ID 78 
SequenceDescription :

Sequence 



<213> OrganismName : Escherichia coli 0157:H7 

<400> PreSequenceString :

MSKATEQNDK LKRAIIISAV LHVILFAALI WSSFDENIEA SAGGGGGSSI DAVMVDSGAV 60 

VEQYKRMQSQ ESSAKRSDEQ RKMKBQQAAE ELREKQAAEQ ERLKQLEKER LAAQEQKKQA 120 

EEAAKQAELK QKQAEEAAAK AAADAKZUCAE ADDKAAEEAA KKAAADAKKK AEAEAAKAAA 180 

EAQKKAEAAA AALKKKAEAA EAAAAEARKK AAAEKAAADK KAAEKAAAEK AAADKKAAAE 240 

40 KAAADKKAAA AKAAAEKAAA AKAAAEADDI FGELSSGKNA PKTGGGAKGN NASPAGSGNT 3 00 

KNNGASGADI NNYAGQIKSA lESKFYDASS YAGKTCTLRI KLAPDGMIiIiD IKPEGGDPAL 3 60 

CQAALAAAKL AKIPKPPSQA VYEVFKNAPL DFKP 394 
<212> Type : PRT 
<211> Length : 394 

SequenceName : SEQ ID 79

SequenceDescription : 

Sequence 



<213> OrganismName : Escherichia coli 0157:H7
<400> PreSequenceString : 

MMKFKKCLLP VAMLASFTLA GCQSNADDHA ADVYQTDQLN TKQETKTVNI ISILPAKVAV 60 
DNSQNKRNAQ AFGALIGAVA GGVIGHNVGS GSNSGTTAGA VGGGAVGAAA GSMVNDKTLV 120 
EGVSLTYKEG TKVYTSTQVG KECQFTTGLA WITTTYNET RIQPNTKCPE KS 172 


<212> Type : PRT 
<211> Length : 172 

SequenceName : SEQ ID 80 

SequenceDescription :

Sequence 



<213> OrganismName : Escherichia coli 0157:H7
<400> PreSequenceString : 
65 MLLSIITVAF RNLE6IVKTH ASIAHLAQAE DISFEWIWD GGSNDGTREY LENLNGIYML 60 
RFVSEPDNGI YDAMNKGIAM AQGKFALFLN SGDIFHQDAA .YFVRKLKMQK DNVMITGDAL 120 
LDFGDGHKIK RSAKPGWYIY HSLPASHQAI FFPVSGLKKW RYDLEYKVSS DYALAAKMYK 180 
AGYAFKKLNG LVSEFSMGGV STTNNMELCA DAKKVQRQIL HVPGFWAELS WHLRQRTTSK 240
TKALYNKS 248 
<212> Type : PRT 
<211> Length : 248 
SequenceName : SEQ ID 81

SequenceDescription :

Sequence 



<213> OrganismName : Haemophilus influenzae Rd
<400> PreSequenceString : 

MKLTTLQTLK KCFTLIELMI VIAIIAILAT lAIPSYQNYT KKAAVSELLQ ASAPYKADVE 60 
LCVYSTNETT SCTGGKNGIA ADIKTAKGYV ASVITQSGGI TVKGNGTLAN MEYILQAKGN' 120 
AAAGVTWTTT CKGTDASLFP ANFCGSVTK 149

<212> Type : PRT

<211> Length : 149 

SequenceName : SEQ ID 82 

SequenceDescription : 

Sequence

<213> OrganismName : Haemophilus influenzae Rd
<400> PreSequenceString :

MLNKKFKLNF lALTVAYALT PYTEAALVRD DVDYQIFRDF AENKGRFSVG ATNVEVRDKNT 60 

25 NHSLGNVLPN GIPMIDFSW DVDKRIATLI NPQYWGVKH VSNGVSELHF GNLNGNMNNG- 120 

NAKSHRDVSS EENRYFSVEK NEYPTKLNGK AVTTEDQTQK RREDYYMPRL DKFVTEVAPX 180 

EASTASSDAG TYNDQNKYPA FVRLGSGSQF lYKKGDNYSL ILNNHEVGGN NLKLVGDAYT 240 

YGIAGTPYKV NHENNGLIGF GNSKEEHSDP KGILSQDPLT NYAVLGDSGS PLFVYDREKG- 3 00 

KWLFIiGSYDF WAGYNKKSWQ EWNIYKPEFA KTVLDKDTAG SLTGSNTQYN WNPTGKTSVX 3 60 

30 SNGSESLNVD LFDSSQDTDS KKKINHGKSVT LRGSGTLTLN NNIDQGAGGL FFEGDYEVKG 420 

TSDSTTWKGA GVSVADGKTV TWKVHNPKSD RLAKIGKGTL IVEGKGENKG SLKVGDGTVX 480 

LKQQADANNIC VKAFSQVGIV SGRSTWLND DKQVDPNSIY FGFRGGRLDA NGNNLTFEHr 540 
RNIDDGARLV NHNTSKTSTV TITGESLITD PNTITPYNID APDEDNPYAF RRlKDGGQrL"!!r 60 0 

LMLENYTYYA LRKGASTRSB LPKNSGESNE NWLYMGKTSD EAKRNVMNHI HNERMNGF.NG 660 

35 YFGEEEGKMUT GNLNVTFKGK SEQNRFLLTG GTNLNGDLKV EKGTLFLSGR PTPHARDIAG 720 
ISSTKKDQHF AENNBVWED DWINRNFKAT NINVTNNATL YSGRNVANIT SNITASDISTAK: 780 
VHIGYKAGDT VCVRSDYTGY VTCTTDKLSD KALNSFNATN VSGNVNLSGN ANFVLGKANIj 840 
FGTISGTGNS QVRLTENSHW HLTGDSNVNQ LNLDKGHIHL NAQNDANKVT TYNTLTVKTSIa 900 
SGNGSFYYLT DLSNKQGDKV WTKSATGNF TLQVADKTGE PTKNELTLFD ASNATRMJLKT 960 

40 VSLVGNTVDL GAWKYKLRNV NGRYDLYNPE VEKRNQTVDT TNITTPNMIQ ADVPSVPSlsTbT 1020 

EEIARVETPV PPPAPATPSE TTETVAENSK QESKTVEKNE QDATETTAQN GEVAEEAKPS 108 0 

VKANTQTNEV AQSGSETEET QTTEIKETAK VEKEEKAKVE KDEIQEAPQM ASETSPKQAKl 1140 

PAPKEVSTDT KVEETQVQAQ PQTQSTTVAA AEATSPNSKP AEETQPSEKT NAEPVTPWS 1200 

KNQTENTTDQ PTEREKTAKV ETEKTQEPPQ VASQASPKQE QSETVQPQAV LESENVPTWT 1260 

45 NAEEVQAQLQ TQTSATVSTK QPAPENSINT GSATAITETA EKSDKPQTET AASTEDASQH 13 20 

KANTVADNSV AiJNSESSDPK SRRRRSISQP QETSAEETTA ASTDETTIAD NSKRSKPNRR 13 80 

SRRSVRSEPT VTNGSDRSTV ALRDLTSTNT NAVISDAMAK AQFVALNVGK AVSQHISQLE 1440 

MNNEGQYNW VSNTSMNENY SSSQYRRFSS KSTQTQLGWD QTISNNVQLG GVFTYVRNSISI 1500 

NFDKASSKNT LAQVNFYSKY YADNHWYLGI DLGYGKFQSN LKTNHNAKFA RHTAQFGLTJ\ 1560 

50 GKAFNLGNFG ITPIVGVRYS YLSNANFALA KDRIKVNPIS VKTAFAQVDL SYTYHLGEFS 1620 

VTPILSARYD TNQGSGKINV NQYDFAYNVE NQQQYNAGIiK LKYHNVKLSL IGGLTKAKQA 168 0 

EKQKTAELKL SFSF 1694 
<212> Type : PRT 
<211> Length : 1694 

SequenceName : SEQ ID 83

SequenceDescription : 



Sequence 



<213> OrganismName : Haemophilus influenzae Rd
<400> PreSequenceString : 

MALVNKIKTL SSVGILAATL FLAGCQAQSN ILAFTPPAPS ASMNVNRTAV VSVTTKDSRA 60 
IQEIASYTKH GELIKLNASP SVTQLFQQVM QQNLISKGFR VGQLNGSNAW VTVOVREEGT 120 
QVEQGNLRYK LNTKIQATVY VQGAKGSYNK SFNVTHSQEG VFNAGNDEIH KVLSQTFNDI 180 
65 VNNIYQDQEV AAAINQYSN 199 
<212> Type : PRT 
<211> Length : 199 
SequenceName : SEQ ID 84
SequenceDescription :

Sequence

<213> OrganismName : Haemophilus influenzae Rd
<400> PreSequenceString :

MLCWIGYKNG ILPQQNSTLY PWLMPSKCGV IFDGFQLVGD DFNSDQTAEN TSPAWQVXjYT 60 
THLQSCSPIH SGENFAPIPL YKQLKNQPHL SQDLIKWQEN WQACDQLQMN GAVLEQQSLA 120 
EXSDHQSTLS KHGRYIiAQEI EKETGIPTYY YLYRVGGQSL ESEKSRCCPS CGAJSTWAIjKDA 18 0 

IFDTFHFKCD TCRLVSNLSW NFL 203

<212> Type : PRT 
<211> Length : 203 

SequenceName : SEQ ID 85

SequenceDescription :

Sequence

<213> OrganismName : Haemophilus influenzae Rd

<400> PreSequenceString :

MGAFAFASVT NANIYAEGDI GLSQTKANGS NNTRVGPRVS VGYKVGNTRV AGDYTHHGKV 60 
DGTKIQGLGA SVLYDFDTNS KVQPYVGARV ATNQFKYTNR AEQKFKSSSD IKLGYGWAG 120 
AKYKLDGNWY ANGGVEYWRL GNFDSTKVNN YGAKVGVGYG P 161 
<212> Type : PRT 
<211> Length : 161 

SequenceName : SEQ ID 86 

SequenceDescription : 

Sequence 



<213> OrganismName : Haemophilus influenzae Rd 
<400> PreSequenceString :

MKKLLIASLL FGTTTTVFAA PFVAKDIRVD GVQGDLEQQI RASLPVRAGQ RVTDNDVANI 60 

VRSLFVSGRF DDVPCAHQEGD VLWSWAKS IISDVKIKGN SXIPTEALKQ NLDANGFKVG 120 

DVLIREKLNE FAKSVKEHYA SVGRYNATVE PIVNTLPNNR AEILIQINED DKAKLASLTF 180 

KGNESVSSST LQEQMELQPD SWWlCLWGNKF EGAQFEKDLQ SIRDYYLNNG YAKAQITKTD 240 

VQLNDEKTKV NVTIDVNEGL QYDLRSARII GNLGGMSAEL EPLLSALHLN DTFRRSDIAD 3 00 

VENAIKAKLG ERGYGSATVN SVPDFDDANK TLAITLWDA GRRLTVRQLR FEGNTVSADS 3 60 

TLRQEMRQQE GTWYNSQLVE LGKIRLDRTG FFETVENRID PINGSNDEVD WY-JCVKERNT 420 

GSINFGIGYG TESGISYQAS VKQDNFLGTG AAVSIAGTKN DYGTSVNLGY TEPYFTKDGV 48 0 

SLGGNVFFEN YDNSICSDTSS KTYKRTTYGSN VTLGFPVNEN NSYYVGLGHT YNKXSNFALE 540 

YNRNLYIQSM KFKGNGIKTN DFDFSFGWNY NSLNRGYFPT KGVKASLGGR VTIPGSDNKY 60 0 

YKLSADVQGF YPLDRDHLWV VSAKASAGYA NGFGNKRLPF YQTYTAGGIG SLRGFAYGSI 660 

GPNAIYAEHG NGNGTFKKIS SDVIGGNAIT TASAELIVPT PFVSDKSQNT VRTSLFVDAA 72 0 

SVWNTKWKSD KSGLDNNVLK SLPDYGKSSR IRASTGVGFQ WQSPIGPLVF SYAKPIKKYE 780 

NDDVEQFQPS IGGSF 795 
<212> Type : PRT 
<211> Length : 795 

SequenceName : SEQ ID 87 

SequenceDescription : 

Sequence 



<213> OrganismName : Haemophilus influenzae Rd 
<400> PreSequenceString : 

MLKKTSLIFT ALLMTGCVQN ANVTTPQAQK MQVEKVDKAL QKGEADRYLC QDDRWRWH 60 
ATHKKYKKNL HYVTVTFQGV SEKLTLMISE RGKNYANIRW MWQERDDFST LKTlSrLGEILA 120 
TQCVSQTSSR LSGQ 134 
<212> Type : PRT 
<211> Length : 134 

SequenceName : SEQ ID 88 

SequenceDescription : 

Sequence 



<213> OrganismName : Haemophilus influenzae Rd 
<400> PreSequenceString : 
MRIIIIFFMG LNMTNFRLER ACLFRYAWAN GRCCLCSSTN QPTNQPTMQP TNQPTNQPTIT 60

QPTMQNSNVS EQLEQINVSG STENSDTKTP PKIAETVKTA KTLEREQAISFW IKDIVKYETG 120

VTWEAGRFG QSGFAIRGVD ENRVAINIDG LRQAETLSSQ GFKELFEGYG NFNNTRNGAE 180

lETLKEVNIT KGADSIKNGS GSLGGSVIYK TKDARDYLIN KDYYVSYKKG YATENNQSFD 240

TLTLAGRYKK FDVLWTTSR NGHELENYGY KNYNDKIQGK KREKADPYKI EQDSTLLKLS 3 00 

FNPTENHRFT FAADLYEHRS RGQDLSYTLK YQRSGMETPE VDSRHTNDKT KRRNISFSYE 3 60 

NFSQTPFWDT LKLTYSDQRI KTRARTDEYC DAGVRHCEGT DNPTGLKVTN GKITRRDGSD 42 0 

LQFEEKNNTA KSSDKTYDFK KFIDTDKRVI DDKLVLNNPS DTWYDCSIFN CENNAKIKVF 4 80 

KGNNYYGYDG KWKEVDLEIK ELNGKKFAKI KDNDRKIKSI LPSSPGYLER LWQERDLDTN 540 

TQQLNLDLTK DFKIWHIEHN LQYGGSYNTA MKRMVNRAGN DASDVQWWAT PTLGEDSWTG 600

KPHTCATTYE WNANLCPRVD PEFSYLLPIK TTGKSVYLFD NFVITDYLSF DLGYRYDMIH 660 

YQPKYKHGIT PKLPDDIVKG LFIPLPNNSN SDPNKVKENV QQNIDYIAKQ MKKYKAHSYS 720 

FVSTIDPTSF LRLQLKYSKG FRTPTSDEMY FTFKHPDFTI LPNTDLKPEI AKTKEIAFTL 780

HNDDWGFIST SL^FKTNYK^JF IDLIFKKQET FI^I^^GGSGRGE TLPFSLYQNI NRDNASI^KGI 840

EINSKVFLGK MAKFMDGFNL SYKYTYQKGR MNGNIPMNAI QPRTMVYGLG YDHPNHKFGF 900

DFYTTHVASK NPEDTYNMFY KEENKICDSTI KWRSKSYTIIi DLIGYVQPIK JILTIRAGVYN 960

LTNRKYITWD SARSIRSFGT SNVIDQSTGIi GIliTRFYAPGR NYKMSVQFEF 1010
<212> Type : PRT 
<211> Length : 1010 

SequenceName : SEQ ID 89 

SequenceDescription :

Sequence 



<213> OrganismName : Helicobacter pylori J99
<400> PreSequenceString : 

MTYRNGKIDL KERFSKNRSF KGIKKKIAKK YTIKNSLSII YSLKTHSNSS LSINKKIFLG 60 

LGFVSAIiSAQ SEDYNSSVYW LNSVNENNNN KSYYISPLRT WAGGNRSFTQ NYNNSQLYIG 120 

TKNASATPNH SSVWFGEKGY IGFITGVFKA RDIFITGAVG SGNELKTGGG AILVFESSNE 180 

LTTNGAYFQN NRAGTQTSWI NIiISNNSVNL TNTDFGNQTP NGGFNVMGRK ITYNGGSVNG 240 

GNFGFDNVDS NGATTISGVT FNNNGALTYK GGMGIGGSIT FTNSNINHYK LNLNANSVTF 3 00 

NNSTLGSMPN GNANTIGNAY ILNANNITFN NLTFNGGWFV FNRSDAHVNF QGTTTINNPT 3 60 

SPFVNMTGKV TINPNAIFNI QNYTPTIGNA YTLFSMKMGN lAYDDVNNLW NIIRLKNTQA 42 0 

TKDNSICNATS NNNTHTYYVT YNLGGTIiYHF RQIFSPDSIV LQSVYYGANN LYYTNSVMIH 48 0 

DNVFNLKNIN DDRADTIFYL NGLNTWNYTQ ARFAQTYGGK NSALVFNATT PWANGAIPKS 540 

NSTVRFGGYE GVNWGKTGYI TGTFTADRVY ITGNMMSGNG AQTGGGATLN FVGATEIMIA 600

GATFKNLKTT SQNSYMTFMA LGNGSGSGKI NVSQSDFYDW TDGGYDFTGN GVFDSVNFNK 660 

AYYKFQGAEN SYNFKNTNFL AGNFKFQGKT TIEKSVLNDA SYAFDGVNNA FNEDKFMGGS 720

FNFNHAEQTN AFNNNSFSGG SFSFNAKQVD FNGNSFNGGV FNFNNTPKAS FTNDTFNVNN 780 

QFKINGAQTD FTFSKGWFN MQGLLSSLSV GTTYQLLNAK SVGYKDNIJHA LYQMLRWTSG 840 

ENPSGKLVDE NKTAPNSAKI YNVQFTDNGL TYYIKENFNN GITIiTRLCTL GYTHCVKIDN 900 

DAFNLKNVNN MASNTVFYLN GMTTWKTAGT GVFTQDYSGT NSVLVFNQTT PFLAGANPTS 960 

NSWGFGKTS GAEWGLVGYI QGVFKANQID ITGTIRSGNG AKTGGGATLV FNAQERLNIA 102 0 

NANLNNDKAG LQNSWMNFIV miGNLNVTNA NFSNQTPHGG FNLKANNITW DKGSVSGGGN 1080 

FGVDNANANG NAVIKNVNFS DNGTLIYKGG ENSAGNSLTL ENNTFNSYNI NAKAQNIIFM 1140 

NNSFNSGSYS FNDTKNVTFK GTNTLINSDP FSRLKGSVSI DNNSIPNIER DLTDKTTYTL 1200 

LSGDNIKYNN QAIADNVFSK NLWDIiIHYDG EQGTLLRTDM NTYFVQFTQS NGQKFVFEET 12 60 

FNPGSITYKY FTIHSSPFHT EADSKDIWNQ VRKQFDFIPG KTPVCVGVCY lAPYKNQDLI 132 0 

GSSAFAWSLM FGATWGTLL LGSAQEKANN NGGSIWFGKN NLLYLHGNFN ATNIFLTNNF 1380 

NVGNPNAGGG ATINFNADET LSADGLNYTN FQTVAMGLQT SASQHSWANF NSKLSMEIKN 1440 

SNFRDFTWGG FRFNSGRITF ENTTFSGWTN INGATESGSS YVNMVANTDL IFTDSILGGG 1500 

IRYDLKANNI IFNNTQMWD VSKNVNQSSL NGNVTFNHSR LSVKPNAAIN IGGDQTQTTL 1560 

ENASSLSFYN DSVANFNGTT AFNGVSYLNL NPNAQVSFNQ ANFNNANVTP YGIPLFGKTP 162 0 

NFGNSVRLIN FKGDAKFNQA TLNLRAKNIH LNFQGASTPE NNSTMNLAES SQASFNALSV 168 0 

EGETNFNLNG SSLLSFNGNS VFNAPVNFYA NNSQISFTHS ATFNADASFD LGNNSTLNFQ 174 0 

SVLLNSALNIi LGMGGNNLAI NAKGNFSFGS QGILNLSYMN LFGGDKKASV YDVLQAQMID 180 0 

GLRGNNGYEK IRFYGIQIEK ADYSFNNGVH SWSFTNPLNT TETITETLHN NRLKVQISQN 1860 

GASNNAMFNL APSLYDYQQM PYDESENSYN HTSDKAGTYY LSSSIKGFGK NNEIPGTYNA 192 0 

QNQPLQALHI YNQAISKQDL NMIASLGKEF LPKVAKLIAS GALDNLNLNS PDSFETIFSI 1980 

LKEYGITLNQ ANWKSLLKII NNFSNTANYH FSQGSLWGA IKEGQTNTNS WWFGGDGYK 2 04 0 

NPCAVGDNTC QMFRQTNLGQ IiLNSSVPYLG YINANFKAKN lYITGTIGSG NAWGSGGSAN 2100 

VSFESATNLV LNQANIDAQG TDKIFSYLGK EGIDKLFGEK GLGNVLSNIV YEESINDNAI 2160 

PKDLANMIPK DLGSKTLSSL LSPTBVNNLL GVSAFKNAIM EILNSKTVGD VFGENGLLNA 2 22 0 

LDPVKRKEID QMLLEQIQAH SSGFEKFIVK TLGIENVENF INNWYGKQSL SSFAMNFVPG 2 28 0 

GLNQALDKIG SSSDAKDLQS FLDKTTFGDI LNQMINQAPL INKLISWLGP QDLSVLWIA 2 3 40 

LNSITNPSKE LLGAISGMGQ KVLNDLLGEG WNKIMSNQV LGQMINKIIA DKGFGGVYHQ 2400 

GLGSILPKSL QDELKKL6MG SLLKPKGLHN LWQKGNFNFV AKNHVFVNNS LFSNATGGEL 2 460 
NFVAGKSIIF NGKNTINFTQ YQGRLSFVSK DFSNISLDTL NATNGLTLNA SKNDISVQKG 2520

QICVWVLDCM TAKGKTTQTN SSSSATAPTN ETLEVSANNF AFLGTIKANG LVDFSKVLQN 258 0 

TTIGTLDLGP NATFKANNLI VNITAFISINNSN YRANISGNFN VAKGATFSTM ENGLNVGGNF 26^0 

NSEGPLIFNL NNPTHQTIIN VTGTSTIMSY NNQALINFNT QLKQGAYTLI NANRMVYGYD 27 OO 

5 NQTILGGSLS DYLKLYTLID FNGKRMQLNG DSLSYDMQPV SIKDGGLiWS FKDNQGQMVY 2760 

SSILYDKIQV TVSDKPMSIQ APSLEYYVKR IQGSAGLNAI KSAGNNSIMW LSELFAAKGG 282 0 

NPLFAPYYLQ DNPTEHIVTL MKDITSALGM LSNSNLKNNS TDVLQLNTYT QQMSRLAKLS 2 88 0 

NFASFDSTDF SERLSSLKNQ RFADAVPNAM DVILKYSQRD KLKNNLWATG VGGVSFVENG 294 0 

TGTLYGVNVG YDRFVRGVIV GGYAAYGYSG FYERITSSKS DNVDVGLYAR AFIKKSELTF 30 OO 

10 SVNETWGANK TQISSNDALL SMINQSYKYS TWTTNAKVNY GYDFMFKNKS IIIiKPQIGLR 3 06 0 

YYYIGMSGLE GVMNNVLYNQ FKADTADPSKK SVLTIDFALE NRHYFNTNSY FYAIGGVGRD 312 0 

LLVHSMGDKL VRFIGNNTLS YRKGDLYNTF ANITTGGEVR LFKSFYANAG VGARFGLDYK 31 BO 

MIDIXGNIGM RLAF 3194 
<212> Type : PRT

<211> Length : 3194

SequenceName : SEQ ID 90 
SequenceDescription : 

Sequence 

<213> OrganismName : Helicobacter pylori J99
<400> PreSequenceString : 

MKQFKKKPKK IKRSHQNQKT ILKRPLWIiMP LLIGGFASGV YADGTDILGL SWGEKSQKVC 60 

VHRPWYAIWS CDKWEEKTQQ FTGNQLITKT WAGGNAANYY HSQNNQDITA NLKNDNGTYF 12 0 

25 LSGLYNYTGG EYNGGNLDIE LGSNATFNLG ASSGNSFTSW YPNGHTDVTF SAGTINVNNS 180 

VEVGNRVGSG AGTHTGTATIj NLNANKVTIN SNISAYKTSQ VNVGNANSVI TINSVSLNGD 2 40 

TCSSLARVGV GANCSTSGPS YSFK6TTNAT NTTFSNSSGS FTFEENATFS GAKLNGGAFT 3 O 0 

FNKKFNATNN TAFNSGSFTF KGTSSFNGAN FSNASYTFNN QATFQNSSFN GGTFTFNDQT 3 60 

NQSTQHPQIQ NSSFSGSATT LKGFATFEQA FNNSNHQLTI QNASFNNATF NNTGKITIEK 4 20 

30 DASFNNTSFN TPVDTNNMTI SGGVTLSGKN DLKNGATLDF GSSKITLTQG TTFNLTSLGS 4 80 

EKSVTILNSR GGITYNHLLN HAINSLTNAL KTNESSSKPQ SFAQGLWDMI TYNGVTGQLL 5 40 

NENAATSKPT DSSPSKSSTW STQVYQVGYK IGDTXYKLQE TFSHNSIIIQ ALESGTYTPP 6 00 

PVINGSKFDL SASNYINADM PWYNHKYYIP KSQNFTESGT YYLPSVQIWG SYTNSFKQTF 6 60 

SASNSNLVIG YNATWTDHNV SSSDTVAFGD TSGSALNGHC GPWPYYQCTG TTNGTYSAYH 72 0 

35 VYITANIiRSG NRIGTGGAAN LIFNGVDSIN lANATITQHN AGAYSSSMTF STQNMDNSQN 7 80 

LNGLNSNGKIi LVYGTTFTNQ AKDGKFIFNA GQATFENTNF NGGSYQFSGD SLNFSNlSfNQF 8 40 

NSGSFEIGAK NTIFNNANFN NSTSFNFNNS SATTSFVGDF TNANSNLQIA GNAVFGNSTN 9O0 

GSQNTAKTFNUX TGSVNIAGNA TFDNWFNSP TNTSVKGKVT LNNITLKNLN APLSFGDGTI 9 60 

VFSAHSVINI GEAITNGNPI TLVSSSKAIE YNDAFSKNLW QLINYQGHGA SSEKQVSSAG 10 2 0 

40 NGVYDWYSF NNQTYNFQEV FSPNSISIRR L6VGMVFDYV DMEKSDRLYY QNALGFMTYM 10 8 0 

PNSYNNNLGN LNNTIYYYDN SIDFYASGKT LFTKAEFSQT FTGQNSAIVF GAKKTIWTSVS 114 0 

DAPQSNVIIR FGDNKGAGSN DASGHCWNLQ CIGFITGHYE AQKIYITGSI ESGNRISSGG 12 00 

GASLKTFNGLQ GILLTNATLY NRAAGTQSSS MNFVSNSANI QAQNSYFIDD TAQNKGNPNF 12 60 

SFNALNLDFS NSSFRGYVGQ TQSVFKFNAV NAISFTNSSN LSSGLYQMQA KSVLFDNSNL 13 2 0 

45 SVSVGTSSIK ANAINLSQNA SINASNHSTL ELQGDLNLND TSSLNLNQSA INVSNNATIN 13 80 

DYASLIASNG SHLNFNGAVN FNSANITTSL SSSSIVFKGA VSLRGQFNLS NNSSLDFQGS 1^4 0 

SAITSNTAFN FYDNAFSQSP ITPHQALDIK VPLSLGGNLL NPNNSSVLNL KNSQLVFSDQ 15 00 

GSLNIANIDL LSDLNGNKNR VYNIIQADMN GNWYERINFF GMRINDGIYD AKNQTYSFTN 15 60 

PLNNALKITE SFKNNQLSVT LSQIPGIKNT LYNIGSEIFN YQKVYNNANG VYSYSDDAQG 16 20 

50 VFYLTSSVKG YYNPNQSYQA SGSNNTTKNN NLTSESSVIS QTYNAQGNPI SALHVYNKGY 16 80 

NFSNIKALGQ MALKLYPEIK KILGNDFSLS SLSNLKGDAL NQLTKLITPS DWKNINELID 17 40 

NANNSWQNF NNGTLIIGAT KIGQTDTNSA WFGGLGYQK PCDYTDIVCQ KFRGTYLGQL 18 00 

LESISADLGY IDTTFNAKEI YLTGTLGSGN AWGTGGSASV TFNSQTSLIL NQANIVSSQT 18 60 

DGIFSMLGQE GINKVFMQAG LANILGEVAM QSINKAGGLG NLIVNTLGSD SVIGGYLTPE 19 20 

55 QKNQTLSQLL GQNNFDNIiMN DSGLNTAIKD LIRQKLGFWT GLVGGLAGLG GIDLQNPEKL 19 80 

IGSMSINDLL SKKGLFNQIT GFISANDIGQ VISVMLQDIV KPSDALKNDV AALGKQMIGE 2O40 

FLGQDTLNSIi ESLLQNQQIK SVLDKVXiAAK GLGSIYEQGL GDLIPNLGKK GIFAPYGLSQ 21_00 

VWQKGDFSFN AQGNVFVQNS TFSNANGGTL SFNAGNSLIF AGNUHIAFTN HSGTLNLLSN 23-60 

QVSNINVTML NASNGLKINA TNNNVSVSQG NLFINASCVQ QSDPTTASAT NPCTTAQNNA 22 20 

60 SSSNASNNAP lALNNNDESL WTANGFNFS GNIYANGWD FSKIKGSANV KNLYLYKNAQ 22 80 

FQANNLTISN QAVLEKNASF VTNIjTLNIQGA FNNNATQKIE VLQNLVIASN ASLSTGIYGL 23 40 

EVGGALNNLG AIHFNLENSQ TPVNPLIQVG GIINLNTTQT PFMNVSVANG GTYTLLKSSR 24 00 

YIDYNINPNS LQSYLKLYTL ININGNHIEE KNGVLTYLGQ RVLLQDKGLL LSVALPNSNN 24 60 

ASQNNILSLS VLHNQIKMSY GNKVMDFTPP TLQDYIVGIQ GQSALNQIEA VGGNNAIKWL 25 2 0 

65 STLMMETKEN PLFAPXYLEM HSLNEILGVT KDLQNTASLI SNPNFRNNAT SLLEMASYTQ 25 8 0 

QTSRLTKLSD FRAREGESNF SERLLELKNK RFSDPNPSEV FVKYSQLSKH PNNIiWIQGVG 2640 

GASFISGGNG TLYGLNVGYD RLVKSVILGG YVAYGYSGFN GNIMHStiANN VDVGMYARAF 2700

LKRNEFTLSA NETYGGNASH INSSNSLLSV LNQRYNYNTW TTSWGNYGY DPMFKQKSW 2760
LKPQVGLSYH FIGLSGMKGK MQNPAYQQFV MHSNPSNESV LTLNMGLESR KYFGKNSYYF 2820
VTARLGRDLL IKAKGDNWR FVGENTLIiYR KGEIFNTFAS VITGGEMHLW RLMYVNAGVG 2880
LKMGLQYQDL NITGNVGMRV AP 2902
<212> Type : PRT 
<211> Length : 2902

SequenceName : SEQ ID 91

SequenceDescription : 

Sequence 



<213> OrganismName : Helicobacter pylori J99 
<400> PreSequenceString :

MAFKKARLIS RFISKGSFKL NKISKKFFTL NQILKRFKPL TCTRHKKTKSIE KPFNKNKSFL 60

KASVLLIGAL GGLSHLRANE CRYWSWSSWS YQDNIESGPN SPTHNSYCLF SSAQGSGTYY 120 

LNTLTTYSAG GASFTQKFNG GTLDIGGNIR FGGTGINGGD VGYITGTYNA QTMNFNSSHI 180 

TTGNSYADGG GTTLNFNATN NITINQASFD NSDAGTQKSY MNFKGSNIKI SGSSFTDDTN 240 

GGFNFSGNNN NSTISFNQTS FNQGTYNFSN SATLSFNNSN FNQGTYHFNS AQSTFENSNF 300 

NQGTYNFNDN TSFNNDTFNQ GTYNFNSSKV SFSGANTLNS SSPFASLKGS VSFNSGAIFN 3 60 

LNQTLNNNQT YDILTTNGAI QYGVYQSYLW DLINYKGDKA ISHVEVSNNT YDVTFDINGQ 420 

DETLQETFSN QSIITQFLGD DLQQQAQQTY QEDVANSQNA LNKVASDNTI ANNDTSYTQS 480 

SNPTILKDAQ GLENTNQQIQ QDEKALEKDL AQIKQIjAMST TGFNEQAFTQ AQKQEQQDEQ 540 

ALQNDENAFN TEQEGLEQAI ANAKHANPTP NPTPSPTPTP IKHTAPNTPP SQVPPTPPSQ 600 

NLPKTNVWNG VYWLQNKTYS NKGIYYIDPN LSGQSGQSGN TLSTYTANLL GRSFGVNANN" 660 

GTLIIGNNTE SVNDNGLIWI GHGGFGYITG TFSAANIYLT NNFKTGEGVS NSDGGGANIT 720 

FKASDNITMD GLNYNNAETV TKMIQTGASQ HSYTTFDATN NISVTDSDFS DMTWGKFSFS 780 

AKNISPSNAS FSGFTNPGGS STISTNASNS LSFTDSRLNG GAIYNLQANS LIFNNTQAVF 840 

NVLYSRGTSN FNATTQLLGM TSFTLSSQSL LNFNGDTTLQ NNANITLGNK SQAAFKNSLT 900 

LDNNSNLSLD NQSVLNANGT SAFNNQASLN lYNGSQAAFS SLFFNGGTLS LNANSKLNAS 960 

SASFSNNTTI NLDDSVLNAM NTSSLMANIN FQGASQADFG GNTTIDTASF NFDSASSLNF 1020 

NNLTANGALN FN6YAPSLTK ALMNVSGQFV LGNNGDINLS DINIFDNITK SVTYNILMAQ 1080 

KGITGISGAN GYEKILFYGM KIQNATYSDN NNIQTWSFIN PLNSSQIIQE SIKNGDLTIE 1140 

VLNNPNSASN TIFNIAPELY NYQDSKQNPT GYSYDYSDNQ AGTYYLTSNI KGLFTPKGSQ 12 00 

TPQTPGTYSP FNQPLNSLNI YNKGFSSENL KTLLGILSQN SATLKEMIES NQLDNITNIN 12 60 

EVLQLiLDKIK ITQAQKQALIa ETINHLTDNI NQTFNNGNLV IGATQDNVTN STSSIWFGGN 1320 

GYSSPCALDS ATCSSFRNTY LGQLLGSTSP YLGYINADFK AKSIYITGTI GSSNAFESGG 13 80 

SADVTFQSAN NLVLNKANIE AQATDNIFNL LGQEGIDKIF NQGNLANVLS QMAMEKIKQA 1440 

GGLGNFIENA LSPLSKELPA SLQDETLGQL IGQNNLDDLL NNSGVMNEIQ NIISQKIiSIF 1500 

GNFVTPSIIE NYLAKQSLKS MLDDKGLIiNF IGGYIDASEL SSILGVILKD ITNPPTSLQK 1560 

DIGWANDLL NEFLGQDWK iCLESQGLVSN IINNVISQGG LSGVYNQGLG SVLPPSLQNA 1620 

LKENDIiGTLIi SPRGLHDFWQ KGYFNFLSNG YVFVNNSSFS NATGGSLNFV ANKSIIFNGD 1680 

NTIDFSKYQG ALIFASNGVS MINITTLNAT NGLSLNAGLN NVSVQKGEIC INLANCPTTK 1740 

NSSPANSSVT PTNESLSVHA NNFTFLGTII SNGAIDLSQV TNNSVIGTLN LNENATLQAN 1800 

NLTITNAFNN ASNSTANIDG NFTLNQQATL STNASGLNVM GNFMSYGDLV FNLSHSVSHA 18 60 

IINTQGTATI MANNNPLIQF NASSKEVGTY TLIDSAKAIY YGYNNQITGG SSLDNYIiKLY 1920 

ALIDINGKHM VMTDNGLTYN GQAVSVKDGG LWGFKDSQN QYIYTSILYN KVKIAVSNDP 1980 

INNPQAPTLK QYIAQIQGVQ SVDSIDQAGG NQAINWLNKI FETKGSPLFA PYYLESHSTK 2040 

DLTTIAGDIA NTLEVIANPN FKNDATNILQ INTYTQQMSR LAKLSDTSTF ARSDFLERLE 2100 

ALKNKRFADA IPNAMDVILK YSQRNRVKNN WATGVGGAS FISGGTGTLY GINVGYDRFI 2160 

KGVIVGGYAA YGYSGFHANI TQSGSSNVNV GVYSRAFIKR SELTMSLNET WGYNKTFINS 2220 

YDPLLSIINQ SYRYDTWTTD AKINYGYDFM FKDKSVIFKP QVGLSYYYIG LSGLRGIMDD 2280 

PIYNQFRANA DPNKKSVLTI NFALESRHYP NKNSYYFVIA DVGRDLFINS MGDKMVRFIG 2340 

NNTIiSYRDGG RYNTFASIIT GGEIRLFKTP YVNAGIGARP GLDYKDINIT GNIGMRYAF 2399 

<212> Type : PRT 
<211> Length : 2399 

SequenceName : SEQ ID 92

SequenceDescription : 

Sequence 



<213> OrganismName : Helicobacter pylori J99 
<400> PreSequenceString : 

MEIQQTHRKI NRPLVSLVLA GALISAIPQE SHAAFFTTVI IPAIVGGIAT GTAVGTVSGL 60 

LSWGLKQAEE ANKTPDKPDK VWRIQAGKGF NEFPNKEYDL YKSLLSSKID GGWDWGNAAR 120 

HYWVKGGQWN KLEVDMKDAV GTYKLSGLRN FTGGDLDVNM QKATLRLGQF NGNSFTSYKD 180 

SADRTTRVNP NAKNISIDNF VEINNRVGSG AGRKASSTVIi TLQASEGITS SKHAEISLYD 240
GATLNIiASNS VKLNGNVWMG RLQYVGAYLA PSYSTINTSK VQGEVDFNHL TVGDQNAAQA 300
GIIASNKTHI GTLDLWQSAG LNIIAPPEGG YKDKPNSTTS QSGTKNDKKE ISQNNNSNTE 360
VINPPNNTQK TETEPTQVID GPFAG6KDTV VNIFHLNTKA DGTIKVGGFK ASLTTNAAHL 420
MIGKGGVNLS NQASGRTLLV ENLTGNITVD GPLRWWQVG GYAIiAGSSAJSr FEFKAGVDTK 480
NGTATFNNDI SLGRFVNLKV DAHTANFKGI DTGWGGFNTL DFSGVTDKVN INKLITASTN 540
VAVKNFNIME LIVKTMGISV GEYTHFSEDI GSQSRINTVR LETGTRSIFS GGVKFKSGEK 600
LVINDFYYSP WNYFDARNVK NVEITRKFAS STPENPWGTS KLMFNNIiTLG QNAVMDYSQF 660
SNLTIQGDFI NNQGTINYLV RGGKVATLMV GNAAAMMFNN DIDSATGFYK PLIKINSAQD 720
LIKMTEHVLL KAKIIGYGNV STGTNGISNV NLEEQFKERL ALYNNNNRMD TCWRNTDDX 780
KACGMAI GKTQ SMVNNPDNYK YLIGKAWRNI GISKTANGSK ISVYYLGNST PTENGGMTTN 840
LPTNTTNNAH SANYAIiVKNA PFAHSATPNIi VAINQHDFGT lESVFELAlJR SKDIDTLYTH 900
S6AQ6RDLLQ TLLIDSHDAG YARQMIDNTS TGEITKQLNA ATDALNNVAS LEHKQSGLQT 960
LSLSNAMILN SRLVNLSRKH TNHINSFAQR IiQALKGQEFA SLESAAEVLY QFAPKYEKPT 1020
NVWANAIGGA SLNSGSNASL YGTSAGVDAF LNGNVEAIVG GFGSYGYSSF SNQAMSLWSG 1080
AlsfNANFGVYS RFFANQHEFD FEAQGALGSD QSSLNFKSTL LQDLNQSYNY IiAYSATARAS 1140
YGYDFAFFRN ALVLKPSVGV SYNHLGSTNF KSNSQSQVAL KNGASSQHLF NANANVEARY 1200
YYGDTSYFYL HAGVLQEFAH FGSNDVASLN TFKINAARSP LSTYARAMMG GELQLAKEVF 1260
LNLGWYLHN LISNASHFAS NLGMRYSF 1288
<212> Type : PRT
<211> Length : 1288

SequenceName : SEQ ID 93
SequenceDescription :

Sequence 

<213> OrganismName : Helicobacter pylori J99
<400> PreSequenceString : 

MKKHIIiSLTL GSLLVSTLSA EDDGFYTSVG YQIGEAAQMV TNTKGIQDLS DRYESLNNLL 60
NRYSTLNTLI KLSADPSAIN AVRENLGASA KNLIGDKANS PAYQAVLLAI NAAVGFWNW 120
GYVTQCGGNA NGQKSISSKT IFNNEPGYRS TSITCSIiNGH SPGYYGPMSI ENFKKLNEAY 180
QILQTALKRG LPALKENNGK VNVTYTYTCS GDGNNNCSSQ VTGVNNQKDG TKTKIQTIDG 240
KSVTTTISSK WDSRADGNT TGVSYTEITN KLEGVPDSAQ ALLAQASTLI NTINNACPYF 300
HASNSSEANA PKFSTTTGKI CGAFSEEISA IQKMITDAQE LVNQTSVINE HEQTTPVGNN 360
NGKPFNPFTD ASFAQGMIiAN ASAQAKMLNL AEQVGQAINP SRLSGTFQNF VKGPIiATCKN 420
PSTAGTGGTQ GSAPGTVTTQ TFASGCAYVG QTITNLKNSI AHFGTQEQQI QQAENIADTL 480
VNFKSRYSEIi GNTYNSITTA LSNIPNAQSL QNAVSKKNNP YSPQGIDTNY YLNQNSYNQI 540
QTINQELGRN PFRKVGIVSS QTNNGAMNGI GIQVGYKQFF GQKRKWGARY YGFFDYNHAF 600
IKSSFFNSAS DVWTYGFGAD ALYNFINDKA TNFLGKNNKL SVGLFGGIAL AGTSWLNSEY 660
VNLATMNNVY NAKMNVANFQ FLFNMGVRMN LARPKKKDSD HAAQHGIELG IiKIPTINTNY 720
YSFMGAELKY RRLYSVYLNY VFAY 744
<212> Type : PRT
<211> Length : 744

SequenceName : SEQ ID 94
SequenceDescription :

Sequence



<213> OrganismName : Helicobacter pylori J99
<400> PreSequenceString :

MIKKAKKFIP FFLIGSLLAE DNGWYMSVGY QIGGTQQFIN NKQLLENQNI INSITQSAIN 60
lAGPTTGLIT LSSQTVIDAL GYGVSNTVGN QLEGISNILN QIGKRKDFYS SRQISSISQQ 120
IIGLKGSSDP LKAHSSQITA KLLSNTQSAF DQGIALSSNI ISAVNSLNPS NNSQEVKAQL 180
QNTAQSMAEL LQQIEHSITK TTSTTYAQSL LSNLTDAVNA SSNWTTYVSA LVNALNTLGV 240
GVFPTTTSTH WLNPPGQW FYPTNSLLGS TSSNSNNQQQ YNNTLLMNTL QGELSTNNQN 300
NPNGCANQIQ CLEQFIQNLT PLAATPTSTN QANQQVQAIA QKLQSVAINA LDNNAINNTT 360
YNLNNLHNAL NFQAYQSTIE QYNNALKQIS WISFSEPKNL LKNTSNNYQI GTVTNDQGQN 420
ISAYDCTSAT GSLSSDASSG ISCSATSSTN NTNSFDNSLV ATSKVQTING KEQIGVNSFN 480
LVSQVWSVYN SLKTSEENLQ KNAKILCNNG SQSGTSPCNS SSGGLSISGM AQLQNILSPT 540
NGTTTNTQAK SNASKLKAMV MVNNEEEAKT TNFNQSSGPT TQSSNSTVMG ALNTVLQNVS 600
NFQQSIQSAF QNQENNIQAW ANALYNTSNP NGNQSQNLTT NNNQDLRIQL RANFYQLINT 660
INQQVPTDMN ALINQSQQTQ QTSGSASTTN NACASGMGSS GNWCYQQWSD SKAYYSGLQS 720
ALGYQTQATT QNGSSGGSNI TYNVQQITLT SGGLLNQIIT NLKSVNGGSN GGSSGNGTSQ 780
INTAYQMLTD ASDGKLGTYN SSNSSNSSNS GNNNGYTPCN STNGSNGTSG SNCYEPNKQQ 840
NATTATTTTD SNLQKVYNDA QKIANIIASS GNNKGVENGL KQFFEALKSN SSSLSNLCGN 900
GSSGSSSTCS GGLINLLGAI PTNGVSDTNM LINLLTEFIK TAGFIQNKDS NVSTSLTSAF 960
QAITSAISQG FQALQNDISP NAILTLLQEI TSNTTTIQSF SQTLRQLLGD KTFFMVQQKL 1020
IDAMINARNQ VQNAQNQANN YGSQPVLSQY AAAKSTQHGM SNGIiGVGIGY KYFFGKARKL 1080
GLRHYFFFDY GFSEIGLANTQ SVKANIFAYG VGTDFLWNLF RRTYNTKALN FGLFAGVQLG 1140
GATWLiSSLRQ QIIDNWGNAN DIHSTNFQVA LNPGVRTNFA EFKRFAKKFH NQGVISQKSV 1200
EFGIKVPLIN QAYIiNSAGAD VSYRRLYTPY INYIMGF 1237
<212> Type : PRT
<211> Length : 1237

SequenceName : SEQ ID 95 

SequenceDescription :

Sequence 

<213> OrganismName : Helicobacter pylori J99

<400> PreSequenceString :

MKQNLKPFKM IKENLMTQSQ KVRPIiAPLSL ALSLSFNPVG AEEDGGFMTF GYELGQWQQ 60 
VKNPGKIKAE ELAGLLNSTT TNNTNIMIAC5 TGGNVAGTLG NIjS"iyn^7QIiGNIi IDLYPTLKTN _.^120 

15 NLKQCGSTITS GNGATAAAAT NNSPCFQGNL ALYNEMVDSI KTLSQNISKN IFQGDNNTTS 18 0 

ANLSNQLSEL NTASVYLTYM NSFLNANNQA GGIFQNNTNQ AYENGVTAQQ lAYVIiKQASI 240 
TMGPSGDSGA AGAFLDAALA QHVFNSANAG ITOLSAKEFTS LVQNIVIJITSQ NALTLANNAN 3 00 

ISNSTGYQVS YGGNIDQARS TQLLNNTTNT IiAKVTALNNE LKANPWLGNF AAGNSSQVNA 3 60 

FNGFITKIGY KQFFGENKNV GLRYYGFFSY NGAGVGNGPT YNQVNLLTYG VGTDVLYWVF 420 

20 SRSFGSRSLN AGFFGGIQLA GDTYISTIiRN SPQLASRPTA TKFQFLFDVG LRMNFGILKK 480 
DLKSHNQHSI EIGVQIPTIY NTYYKAGGAE VKYPRPYSVY WVYGYAF 527 
<212> Type : PRT 
<211> Length : 527 

SequenceName : SEQ ID 96 

SequenceDescription :

Sequence 



<213> OrganismName : Helicobacter pylori J99

<400> PreSequenceString :

MKKTLLLSLS LSFGLHAEDD GFYASAGIRI GEAAQMVKNT KGIQQLSENY EKLNNLLNNY 60 

NTLNTIiVKLS SDPSAVNDAR DNLGSSTRNL LDVKAITSPAY QAVLLALNAA VGLWQVTSYA 12 0 

FTACGPGSME NANGGIQTFN NVPGQNTTTI TCNSYYEPGH GGPISTKNYA IINKAYQIIQ 180 

KALTANGEGI PVLSNTTTKL DFTINGDKRT GGEPNKKLVY PWSHGKAIST SWNATITAPT 240 

35 TENINTTNSA QELLKQASII ITTLNSACPN FQNGGSGYWA GISGNGTMCG MFKNEISAIQ 3 00 

GMIANAQEAV AQAKIVSENT QNQNSLDAGK PFNPYTDASF AESMLKNAQA QAEILNQAEQ 3 60 

WKNFEKIPT AFVNDSLGVC YEVQGGERRG TNPGQTTSNT WGAGCAYVGQ TITNLKNSIA 420 

HFGTQEQQIQ QAENIADTLV NFKSRYSELG NTYNSITTAL SNIPNAQSLQ NAVSKK3SINPY 480 

SPQGIDTNYY LNQNSYNQIQ TINQELGRNP FRKVGIVSSQ TNNGAMNGIG IQVGYKQPFG 540 

40 QKRKWGARYY GFFDYNHAFI KSSFFNSASD WTYGFGADA LYNFINDKAT NFLGKISINKLS 600 

VGLFGGIALA GTSWLNSEYV NLATMNNVYN AKMNVANPQF LFNMGVRMNL ARPKKKDSDH 660 

AAQHGIELGL KIPTINTNYY SPMGAELKYR RLYSVYLNYV FAY 703 
<212> Type : PRT 
<211> Length : 703 

SequenceName : SEQ ID 97

SequenceDescription : 



Sequence 



<213> OrganismName : Helicobacter pylori J99
<400> PreSequenceString : 

MIKKNRTLFL SLALCASISY AEDDGGPFTV GYQLGQVMQD VQNPGGAKSD ELARELNADV 60 

TNNILNNNTG GNVAGALSNA PSQYLYSLLG AYPTIOiNGND VSANALLSGA VGSGTCAAAG 12 0 

TAGGTTLNTQ SACTAAGYYW LPSLTDRILS TIGSQTNYGT NTNFPNMQQQ LTYLNAGNVF 180 

55 FNAMNKALEK NGTATANSTS STSGATGSDG QTYSQQAIQY LQGQQNILNN AANLLKQDEIi 240 

LLEAFNSAVA ANIGNKEFNS AAFTGLVQGI IDQSQLVYNE LTKNTISGSA VNNAGINSNQ 3 00 

ANAVQGRASQ LPNALYNVQV TLDKINALNN QVRSMPYLPQ PRAGNSRATN ILNGPYTKVG 3 60 

YKQFFGKKRN IGLRYYGFPS YNGASVGFRS TQNNVGLYTY GVGTDVLYNI FSRSYQNRSV 42 0 

DMGFFSGIQL AGETFQSTLR DDPNVKLHGK INNTHFQFLF DFGMRMNPGK LDGKSNRHNQ 480 

60 HTVEFGWVP TIYNTYYKSA GTTVKYFRPY SVYWSYGYSF 520 



<212> Type : PRT 

<211> Length : 520 

SequenceName : SEQ ID 98 
SequenceDescription : 

Sequence

<213> OrganismName : Helicobacter pylori J99
<400> PreSequenceString : 

MKKKFLSLTL GSLLVSALSA EDNGFPVSAG YQIGESAQMV KNTKGIQDLS DSYERLNNLL 60 

TNYSVLiNALI RQSADPNAIN NARGNIiNASA KNLINDKKNS PAYQAVLLAL NAAAGLWQVM 120 

5 SYAISPCGPG KDTSKNGGVQ TFHNTPSNQW GGTTITCGTT GYEPGPYSIL STENYAKINK 180 

AYQIIQKAFG SSGKDIPALS DTNTELKFTI NKNNGNTNTN NNGEEIVTKN NAQVLLEQAS 240 

TIITTLNSAC PWINNGGAGG ASSGSLWEGI YLKGDGSACG IFKNEISAIQ DMIKNAAIAV 3 00 

EQSKIVAANA QNQRNLDTGK TFNPYKDANF AQSMFANAKA QAEILNRAQA WKDFERIPA 3 60 

EFVKDSLGVC HEVQNGHLRG TPSGTVTDNT WGAGCAYVGE TVTNLKDSIA HFGDQABRIH 420 

10 NARNIiAYTIoA NFSSQYQKLG EHYDSITAAI SSLPDAQSLQ NWSKKTNPM SPQGIQDNYY 480 

IDSNIHSQVQ SRSQELGSNP FRRAGLIAAS TTNNGAMNGI GFQVGYKQFF GKNKRWGARY 540 

YGFVDYNHTY NKSQFFNASS DVWTYGVGSD LLVNFIKDKA TKHNKISFGA FGGIALAGTS 600 

WLNSQYVNIiA NVNNYYKAKI NTANFQFLFN LGLRMNLARK KHRATDNAAQ HGIELGTKIP 660 

TINTNYYSi:iL--GT'rLQYRRLY SVYLNYVFAY 690

<212> Type : PRT

<211> Length : 690

SequenceName : SEQ ID 99
SequenceDescription :



Sequence



<213> OrganismName : Helicobacter pylori J99 
<400> PreSequenceString :

MKIKKSLFAL SFSLMASLSR AEDDGFYMSV GYQIGEAVQK VKNTGALQNL ADRYDNLSNL 60 

25 LNQYNYLNSL VNLASTPSAI TGAIDNLSSS AINLTSATTT SPAYQAVALA LNAAVGMWQV 120 

lAFGISCGPG PNLGPEHLEN GGVRSFDNTP NYSYNTGSGT TTTTCMGASM VGPNGILSSS 180 

EYQVLNTAYQ TIQTALNQNQ GGGMPALNSS KNMWNINQT FTKNPTTEYT YPDGNGNYYS 240 

GGSSIPIQLK ISSVNDAENL LQQAATIINV LTTQNPHVNG GGGAWGFGGK TGNVMDIFGD 3 00 

SFNAINEMIK NAQAVLEKTQ QLNANENTQI TQPDNFNPYT SKDTQFAQEM LNI^ANAQAEI 3 60 

30 LSIiAQQVADN FHSIQGPIQQ DLEECTAGSA GVINDNTYGS GCAFVKETLN SLEQHTAYYG 420 

NQVNQDRALS QTILNFKEAL STLGNDSKAI NSGISNLPNA KSLQNMTHAT QNPNSPEGLL 480 

TYSLDTSKYN QLQTVAQELG KNPFRRIGVI NYQNNNGAMN GIGVQAGYKQ FFGKKRNWGL 540 

RYYGFFDYNH AYIKSNFFNS ASDWTYGVG MDALYNFIND KNTNFLGKNN KLSVGIiFGGF 60 0 

ALAGTSWLMS QQVNLTMMNG lYNANVSASN FQFLFDLGLR MNLARPKKKD SDHAAQHGME 660 

35 LGVKIPTINT DYYSFMGAEL KYRRLYSVYL NYVFAY 696 



<212> Type : PRT

<211> Length : 696 

SequenceName : SEQ ID 100
SequenceDescription :

Sequence 



<213> OrganismName : Helicobacter pylori J99 
<400> PreSequenceString : 

45 MKIKKSLFAL SFSLMASLSR AEDDGFYMSV GYQIGEAVQK VKNTGALQNL ADRYDNLSNL 60 

LNQYNYLNSL VNLASTPSAI TGAIDNLSSS AINLTSATTT SPAYQAVALA LNAAVGMWQV 12 0 

lAFGISCGPG PNLGPEHLEN GGVRSFDNTP NYSYNTGSGT TTTTCNGASN VGPNGILSSS 180 

EYQVLNTAYQ TIQTALNQNQ GGGMPALNSS KNMWNINQT FTKNPTTEYT YPDGNGNYYS 240 

GGSSIPIQLK ISSVNDAENL LQQAATIINV LTTQNPHVNG GGGAWGFGGK TGNVMDIFGD 30 0 

50 SFNAINEMIK NAQAVLEKTQ QLNANENTQI TQPDNFNPYT SKDTQFAQEM LNRANAQAEI 360 

LSLAQQVADN FHSIQGPIQQ DLEECTAGSA GVINDNTYGS GCAFVKETLN SLEQHTAYYG 42 0 

NQVNQDRALS QTILNFKEAL STLGNDSKAI NSGISNLPNA KSLQNMTHAT QNPNSPEGLL 4 80 

TYSLDTSKYN QLQTVAQELG KNPFRRIGVI NYQN1SINGAMN GIGVQAGYKQ FFGKKRNWGL 54 0 

RYYGFFDYNH AYIKSNFFNS ASDVWTYGVG MDALYNFIND KNTNFLGKNN KLSVGLPGGF 600 

55 ALAGTSWLNS QQVNLTMMNG lYNANVSASN FQFLFDLGLR MNLARPKKKD SDHAAQHGME 660 

LGVKIPTINT DYYSFMGAEL KYRRLYSVYL NYVFAY 696 
<212> Type : PRT 
<211> Length : 696 

SequenceName : SEQ ID 101 



SequenceDescription :

Sequence 



<213> OrganismName : Helicobacter pylori J99 
<400> PreSequenceString :

MHKKVLLALT ASLICQESLF AKDKDYTLGK VSTAGKKDRS DYSGQVNLGY SGITAPKSWQ 60 
DEEVKKYTGS RTVISNKALT QQANQSIEEA LQNVP6LQIR NATGVGAMPT IQIRGFGAGG 120
SGHSDATLML VNGIPVYMAP YAHIELiDIFP VTFQAIDRID VIKGGGSVQY GPNTYGGIVN 180

IITKPIPNQW ENQAAERITY WAKARNAGFA APPDKTGDPS FIKSLGNNLL YNTYVRSGGM 240 

INKHVGIQAQ ANWVRGQGFR DNSPSSISNY WLDGVYDINE SNGIKAYYQY YDFAIAQPGS 3 00 

LSEQDYKINR FANLRPLNQK GGRSQRFGAV YENRFGDLDR VGGTFSFTYY GQLMTRDFQV 3 60 

5 SSSYNSANMV TCFSEAACRA AGLPAGYNLA VPYYATNYNG WAEVENPVRS INNAFEPKVN 420 

LIVNTGKVRQ TFIMGLRFMT TTFLQRQYLN TNECATKTSG EGAGFLCEGP NVMSGWKPHI 48 0 

KHGVYRNWNN WRJSINYTAVYL SDRIEAWDGR FFIVPGLRYA FVQYNNENAS NWMQIPEKDL 540 

RKIKHMNNWM PSTNIGFIPV QGDHNVLTYF NYQRSFVPPQ LDVLSYGGAE YFTQHFDTVE 600 

AGARYTYKDK FSFNADYFRI WARDFATGQY SVYTSGPMKG NVRPINGYSQ GVELELYYRP 660 

10 IRGLQFHAAF NYIDTRVTSH GPLTDLNGDV LKGTSYNKHF PFVSPFQFIF DARYNWRKTT 720 

IGISSYFYSR AYSGISNSAA GGYYGMQYYS GGlsTNYESVLN SGYQCEAWCM TQHEGLLPWY 780 

WVWNIQVSQI FWENGRHRVT GSLQIIJNIFN MKYYFTGIGS SPAGLQPAPG RSVTAYLNYT 840 

F 841 
<212> Type : PRT

<211> Length : 841

SequenceName : SEQ ID 102 
SequenceDescription : 



Sequence 


<213> OrganismName : Helicobacter pylori J99 
<400> PreSequenceString : 

MKKTLLLSLS ASSLLNAEDN GFFISAGYQI GEAAQIWKNT GELKKLSDTY ENLSNLLTMF 60 
NNDNQAVTMA SSPSEINAAI DNLKANTQGL IGEKTNSPAY QAVYLALNAA VGLWNVIAYN 120 

25 VQCGPGNSGQ QSVTFEGQPG HNSSSINCNL TGYNNGVSGP LSIENFKKLN QAYQTIQQAL 180 
KQDSGFPVLD SAGKQVTITI TTQTNGANKS ETTTTTTTTN DAQTLLQEAS KMISVLTTNC 240 
PWVNHNQGQN GGAPWGLDTA GNVCQVFATE FSAVTSMIKN AQEIVTQAQS LNQQNNQNAP 3 00 

QDFNPYTSAD RAFAQNMLNH AQAQAKILEL ADQMKKDLNT IPSQFITNYL AACHNGGGTL 3 60 

PDAGVTNNTW GAGCAYVEET ITALNWSLAH FGTQAEQIKQ SELLARTILD FRGSLSNLNN 420 

30 TYNSITTTAS NTPNSPFLKN LISQSTNPNN PGGLQAVYQV HQSAYSQLLS ATQELGHNPF 480 
RRVGIilSSQT NNGAMNGIGV QVGYKQFFGE KRRWGLRYYG FFDYNHAYIK SSFFNSASDV 540 
FTYGVGTDVXi YNFINDKTTK NSKISFGVFG GIALAGTSWL NSQYVNLATF NNFYSAKMNV 600 
ANFQFLFNLG LRMNIAKNKK KASDH2^QHG VELGVKXPTI NTNYYSLLGT QLQYRRIiYSV 660 
YLNYVFAY 668 

<212> Type : PRT

<211> Length : 668 

SequenceName : SEQ ID 103 
SequenceDescription : 

Sequence



<213> OrganismName : Helicobacter pylori J99 
<400> PreSequenceString :

MRKLFIPLIili FSALEANEKN GFFIEAGFET GLLEGTQTQE KRHTTTKNTY ATYNYLPTDT 60 

45 ILKRAANLFT NAEAISKLKF SSLSPVRVLY MYNGQLTIEN FLPYNLNNVK LSFTDAQGNT 120 

IDLGVIETIP KHSKIVLPGE AFDSLKEAFD KIDPYTLPLP KFEATSTSIS DTNTQRVFET 18 0 

LNNIKTNLIM KYSNENPNNF NTCPYNNNGM TKNDCWQNFT PQTAEEFTNL MLNMIAVLDS 24 0 

QSWGDAILNA PFEFTNSSTD CDSDPSKCVN PGVNGRVDTK VDQQYILNKQ GIINNFRKKI 3 00 

EIDAWLKNS GWGLANGYG NDGEYGTLGV EAYALDPKKL FGNDLKTINL EDLRTILHEF 360 

50 SHTKGYGHNG NMTYQRVPVT KDGQVEKDSN GKPKDSDGLP YNVCSLYGGS NQPAFPSNYP 420 

NSIYHNCADV PAGFLGVTAA VWQQLINQNA LPINYANLGS QTNYNLNASL NTQDLANSML 480 

STIQKTFVTS SVTNHHFSNA SQSFRSPILG VNAKIGYQNY FISTDFIGLAYY GIIKYNYAKA 540 

VNQKVQQLSY GGGIDLLLDF ITTYSNKNSP TGIQTKRNFS SSFGIFGGLR GLYNSYYVLN 600 

KVKGSGNLDV ATGLNYRYKH SKYSVGISIP LIQRKASWS SGGDYTNSFV FNEGASHFKV 660 

55 FFNYGWVF 668 



<212> Type : PRT 

<211> Length : 668 

SequenceName : SEQ ID 104 
SequenceDescription : 


Sequence 



<213> OrganismName : Helicobacter pylori J99 
<400> PreSequenceString : 
MNKTTIKILM GMALLSSLQA AEAELDEKSK KPKFADRNTF YLGVGYQLSA INTSFSTSSI 60
DKSYFMTGNG FGWLGGKFV AKTQAVEHVG FRYGLFYDQT FSSHKSYIST YGLEFSGLWD 120
AFNSPKMFLG LEFGLGIAGA TYMPGGAMHG IIAQYLGKEN SLFQLLVKVG FRPGPFHNEI 180
TFGLBCFPVIP NKKTEIVDGL SATTLWQRLp VAYFNYIYNF 220
<212> Type : PRT 
<211> Length : 220 

SequenceName : SEQ ID 105 
SequenceDescription :

Sequence 



<213> OrganismName : Helicobacter pylori J99 

<400> PreSequenceString :

MKKTKKTILLi SLTLAASLLH AEDNGVFLSV GYQIGEAVQK VKNADKVQKL SDVYEQLSKL 6 0 

liANDNGTSSK TSAQAINQAV NNLNESAKTL AGGTTNSPAY QATLLALRSA LGLWNSMGYA 12 0 

WCGGYIKKP GENNQKNFHY TDEMGNGTTI NCGGSTNSNG THSPNGTNTL KADKNVSLSI 18 0 
' EQYEICIHEAY QILSKALKQA GIiAPLNSKGE KLEAtlVTTSK DQQGTSSDQT ^XI'TTSVIDTT . ^-.24 0 

15 NDAQNLLTQA QTIVNTLKDY CPMLIAKSSS NGGTNGANTP SWQTAGGGKN SCATFGAEFS 3 00 

AISDMISNAQ KIVQETQQLN ANQPKNITQP NNFNLNSPGS LTALAQSMLK NAQSQTEILK 3 60 

IiANQVASDFD KLSSGYLKDY IGKCDVSGVS SSNMTPQNMN TTWGKGCAGV EETLTSLKAS 42 0 

TTDFNNQTTP QLDQAQTIiAlJl TLTQELGNNP FKRVGIIGSQ TNNGAMNGLG VQAGYKQFFG 480 

QKRRWGLRYY GFFDYNHTYI KSSFFNSSSD VIiTYGVGSDL LFNFINDKNT MFLGKNNKIS 540 

20 VGLFGGIALA GTSWLNSQPV NLKTISNVYS AKVNTANFQF LFNLGIiRTNL ARPKKKDSDH 600 

SAQHGMELGV KIPTINTNYY SYLGTKLeYR RLYSVYIiNYV FAY 643 
<212> Type : PRT 
<211> Length : 643 

SequenceName : SEQ ID 106

SequenceDescription :

Sequence 



<213> OrganismName : Helicobacter pylori J99

<400> PreSequenceString :

MKKTILLSLM VSSLFAENDG VYMSVGYQIG EAAQMVKNTG EIQKVSNAYE NLNNLLTRYN 60 

ELKQTASNTD SSTAQAIDNL EKSASRLKTT PNTANQAVSS ALSSAVGMWQ VIASNLAITNS 12 0 

LSSSEYKKLK ATSQLLQNTL ElsTKNOTSTLKIE NDYDQLLTQA STIINTLQSQ CPGVDGGMGK 180 

PWGINTSGNA CAIFGSTFNA UsTSMIDSAKK AAADARRTAP ESPNQQNAFT NADFJSIKNLNQ 24 0 

35 VSSVINDTIS YLKGDNLETI YNTIQKTPNS KGFQSLVSRS SYSYSLNETQ YSQFQTTTKE 3 00 

FGHNPFRSVG LINSQSNNGA MNGVGVQLGY KQFFGKNKFF GIRYYGFFDY NYAYIKSMFF 3 60 

NSASNVFTYG AGSDLLLNFI NGGSDRNRKV SFGIFGGIAL AGTTWLJTMQS ANLKITNSAY 420 

SAKINNTNFQ FLFNTGLRLQ GIHHGIELGV KIPTINTNYY SFMGAKLAYR RLYSLYLNYV 480 

LAY 483 

<212> Type : PRT

<211> Length : 483 

SequenceName : SEQ ID 107 
SequenceDescription : 

Sequence

<213> OrganismName : Helicobacter pylori J99
<400> PreSequenceString : 

MPKASQVLFF GAFLSTSLQG FEAKLNGFVD QSSTIGFMQH KINKERGIYP MQQFATIAGY 60 

50 LGLGFSLLPK KVSDHVLKGK IGGMVGSIFY DGTKKFEDGS VAYNLFGYYD GFMGVYTNIL 120 

QTDSLETQNM KHNKNVRNYV FSDAYLEYAY KNYFEIKAGR YLSTMPYKSG QTQGFQVSGQ 180 

YKHARLTWFS SWGRAFAYGS FLMDWFAART TYSGGFTKNN NGGYDSHGRK VLYGTHAVQL 24 0 

TYKPHRFLIE GFYYLSPQIF NAPGVKIGWD SNPNFSGTGF RSDTAIIGFF PIYYPWMIVK 300 

SNGSPVYRYD TPATQNGQNL IIRQRFDINN YNVSIAFYKV FQNANGWIGN MGNPSGVIMG 360 

55 SNSVYAGFTG TALKRDAATI FLSCGGTHFA KKFTWKFATQ YSNSWSWEA RAMISLGYKF 420 

TEYLSGSVDL AYYGVHTNKG FKPGENGPVP KNFPALYSDR SALYTALVAS F 471 



<212> Type : PRT 
<211> Length : 471 
SequenceName : SEQ ID 108

SequenceDescription : 

Sequence 



<213> OrganismName : Helicobacter pylori J99
<400> PreSequenceString : 

MLRLVSKTIC LSLISLFNPL EAFQKHQKDV FFVEAGFETG LLBGAQTKEQ AIAQNTQNTQ 60

KIYBNPLTHP QTKEQPKEQN KSDTATPQSV YGRYYILQNT ILEKATELFT AANINGNGLT 120

FYSQNPVYVM AYNKDNAEFE GYGNNSWVI QNFLPYmNN lELSYTDAQG KAVNLGVIET 180 

IPKDSQIIIiP ASLFNNFSND SPFNSDGLQQ LQTTTTPFSD ANTQSLFEKL SQITTNLQMT 240 

YENTDPPSSG NNDPNGPLAS PKPHYECPGY KKSCQVASVS FTPQTAEELT NliMLDMIAVF 3 00 

5 DSKSWEEAVL NAPFQFSNSP SECGIDYPKC VNPFNNGLVD PKDEKYALTP EEVINSYRVA 360 

NELTVNLIiHA AKGFLGLGSQ LGSANAPDDD GFNQGVLGIA PFALDPEKLF GKNLNKVAIL 420 

ALRDIIHEYG HTLGYTHNGN MTYQRVRLCQ EGNGPEARCE GGHEVEKKTGK EELEFSNGHE 480 

VRDHDGYTYD VCSRFGGKNQ PAPPSNYPNS XYTNCAQVPA GLXGVTTAVW QQLINQNALP 540 

INFANLITSQT SHLNAGLNAQ NFATSMVSAI AQNFSTTSTT TYRSSSKNFR SPILGVNVKI 600 

10 GYQHYFNDYI GLAYYGIIQY NYAQANDEKI QQLSYGGGMD VLFDFITTYT NKKQDHPTKK 660 

VFASSFGVFG GLRGLYNSYY VFNQVKGSGN LDIVTGFNYR YKHSKYSIGV SVPLIQSGIK 720 

lASNNGIYAD SWLNEGGSH PKVFFNYGWV F ' 751 

<212> Type : PRT 

<211> Length : 751

SequenceName : SEQ ID 109

SequenceDescription : 



Sequence 



<213> OrganismName : Helicobacter pylori J99
<400> PreSequenceString : 

MQNFVFNKKW LIYSSLLPLF FLNPLMAEDD GFFMGVSYQT SIiAVQRVDNS GLNASQDAST 60 

YIRQNAIALE SAAVPLAYYL EAMGQQTRVL MQMLCPDPSK RCLLYAGGYQ NGQNNNGDTG 120 

NNPPRGNVNA TFDMQSLVNN LNKLTQIjIGE TLIRNPENLP NSKVFNVKFG NQSTVIALPE 180 

25 GLANTiVIDALN NDITNALTTL WYNQTLTNKS FSTPSNTSVN FSPQVLQHLIi QDGLATANNN 24 0 

QTICSTQNQC TATNEAKSIA QNAQNIFQAL MQAGILGGLA NEKQFGFTYN KAPNGSDSQQ 3 00 

GYQSFSGPGY YTKNDNTTQA PLKALPAGAT IGSGNGQYTY HPSSAVYYLA DSIXANGITA 3 60 

SMIFSGMQMF ANKAAKLIGT SSYNQMQDAI NYGESLLSNT VAYGDFITNW VAPYLDLNNK 42 0 

GLNFLPNYGG QLMGANNQTP QLTPQQAQQE QKVIMNQLEQ ATNAPTPAQI NRIIiANPYSP 480 

30 TAKTLMAYGL YRSKAVIGGV IDEMQTKVNQ VYQMGFARNF LEHNSNSNNM NGFGVKMGYK 540 

QFPGKKRMFG LRYYGFYDFG YAQFGTESSIi VKATLSSYGA GTDFLYNVFT RKRGTEAIDI 600 

GFFAGIQLAG QTWKTNFIjDQ VDGNHLKPKD TSFQFLFDLG IRTNFSKIAH QKRSRFSQGI 660 

EFGIiKIPVLY HTYYQSEGVT AKYRRDFSFY VGYNIGF 697 
<212> Type : PRT 

<211> Length : 697

SequenceName : SEQ ID 110 
SequenceDescription : 



Sequence 

<213> OrganismName : Helicobacter pylori J99
<400> PreSequenceString : 

MKKTXLLSLS LSLASSLLHA EDNGFFVSAG YQIGEAVQMV KNTGELKNLN EKYEQLSQYL 60 

NQVASLKQSI QNANNIELVN SSLNYLKSFT NNNYNSTTQS PIFNAVQAVI TSVLGFWSLY 120 

45 AGNYLTFFW NKDTQKPASV QGNPPFSTIV QNCSGIENCA MNQTTYDKMK KLAEDLQAAQ 18 0 

QNATTKANNL CALSGCATTQ GQNPSSTVSN ALNLAQQLMD LIANTKTAMM WKNIVIAGVS 240 

NVSGAIDSTG YPTQYAVFNN IKAMIPILQQ AVTLSQSNHT LSASLQAQAT GSQTNPKFAK 3 00 

DIYAFAQNQK QVISYAQDIF MLFSSIPKDQ YRYLEKAYLK IPNAGKTPTN PYRQEVNLNQ 360 

EIQTIQNNVS YYGNRVDAAL SVAKDVYNLK SNQTEIVTTY NNAKNLSQEI SKLPYNQVNT 42 0 

50 KDIITLPYDQ NAPAAGQYNY QIMPEQQSNL SQALAAMSNN PFKKVGMISS QNNNGALNGL 480 

GVQVGYKQFF GESKRWGLRY YGFFDYNHGY IKSSFFNSSS DIWTYGGGSD LLVNFINDSI 54 0 

TRKNNKLSVG LFGGIQLAGX TWLNSQYMNL TAFNNPYSAK VNASNFQFLF NLGLRTNLAT 600 

AKKKDSERSA QHGVELGIKI PTINTNYYSF LGTKLEYRRL YSVYLNYVFA Y 651 

<212> Type : PRT

<211> Length : 651 

SequenceName : SEQ ID 111
SequenceDescription :

Sequence



<213> OrganismName : Helicobacter pylori J99 
<400> PreSequenceString : 

MLKLASKTIC LSLISSFTAV EAFQKHQKDG FFIEAGFETG LLQGTQTQEQ TIATTQEKPK 60 

65 PKPKPKPITP QSTYGKYYIS QSTILKNATE LFAEDNITNL TFYSQNPVYV TAYNQESABB 12 0 

AGYGNNSLIM IQNFLPYNLN NIELSYTDDQ GNWSLGVIE TIPKQSQIIL PASLFNDPQL 180 

NADGFQQLQT NTTRFSDAST QNLFNKLSKV TTNLQMTYIN YNQFSSGNGS GSKPPCPPYE 240

NQiVMCVAKVP PFTSQDAKNL TNIlMLNM^4AV FDSKSWEDAV LNAPFQFSDN NLSAPCYSDY 300

LTCVNPYNDG LVDPKLIAKN KGDEYNIENG QTGSVILTPQ DVIYSYRVAN NIYVNLLPTR 3 60 

GGDLGLGSQY GGPNGPGDDG TNFGALGILS PFLDPEILFG KELNKVAIMQ LRDIIHEYGH 42 0 

TLGYTHNGNM TYQRVRMCEE NNGPEERCQG GRIEQVDGKE VQVFDNGHEV RDTDGSTYDV 480 

5 CSRFKDKPYT AGSYPNSIYT DCSQVPAGLI GVTSAVWQQL IDQNALPVDF TNIiSSQTNYL 540 

NASLNTQDFA TTMIiSAISQS LSSSKSSATT YRTSKTSRPF GAPLLGVNLK MGYQKYFNDY 60 0 

LGLSSYGIIK YNYAQANNEK IQQLSYGVGM DVLFDFITNY TNEKNPKSNL TKKVFTSSLG 660 

VFGGLRGLYN SYYLLNQYKG SGNLNVTGGL NYRYKHSKYS IGISVPLVQL KSRIVSSDGA 72 0 

YTNSITLNEG GSHFKVPFNY GWIP 744 
<212> Type : PRT

<211> Length : 744 

SequenceName : SEQ ID 112 

SequenceDescription : 



Sequence



<213> OrganismName : Helicobacter pylori J99 
<400> PreSequenceString : 

MRKLFIPLLIi FSALEANEKN GFFIEAGFET GLLEGTQTQE KRHTTTKNTY ATYNYLPTDT 60 

20 ILKRAANLFT NAEAISKLKF SSLSPVRVLY MYNGQLTIEN FLPYNLNNVK LSFTDAQGNV 12 0 

IDLGVIETIP KHSKIVLPGE APDSLKIDPY TLFLPKIEAT STSISDANTQ RVFETLNKIK 18 0 

TNLWNYRNE NKFKDHENHW EAFTPQTAEE FTNLMLNMIA VLDSQSWGDA ILNAPFEFTN 240 

SPTDCDMDPS KCVNPGTNGL VNSKVDQKYV LNKQDIWKF KMKADLDVIV LKDSGWGLG 3 00 

SDITPSNNDD GKHYGQLGW ASALDPKKLF GDNLKTINLE DLRTILHEFS HTKGYGHNGN 3 60 

25 MTYQRVPVTK DGQVEKDSNG KPKDSDGLPY NVCSLYGGSN QPAFPSITYPN SIYHNCADVP 420 

AGFLGVTAAV WQQLINQNAL PINYANLGSQ TNYNLNASLN TQDLANSMLS TIQKTFVTSS 480 

VTNHHFSNAS QSFRSPILGV NAKIGYQNYF NDFIGLAYYG IIKYNYAKAV NQKVQQXjSYG 540 

GGIDLLLDFI TTYSNKNSPT GIQTKRNFSS SFGIFGGLRG LYNSYYVLNK VKGSGNBDVA 600 

TGLNYRYKHS KYSVGISIPL IQRKASWSS GGDYTNSFVF NEGASHFKVF FNYGWVF 657 



<212> Type : PRT

<211> Length : 657

SequenceName : SEQ ID 113
SequenceDescription :

Sequence 



<213> OrganismName : Helicobacter pylori J99
<400> PreSequenceString : 

40 MSLATSYNVS NNFSKFNIKR VRGYLICLVC MTPKMIQRGL NGVSFYGCSD YVNKGDCKGV 60 
LREINGSMKM VCLHCENTPI MEKVESGRGG AYACKNCNRK FYFIDLAKQN ERKKDLEKEK 12 0 

KELLNKIEKQ KIKHLERPIL AGVKANIKEN SFFLGCKKYP KCEWTASMDS QDLKCPKCNR 180 
LMKRKKNFKN NEFFTATSLT LNAIEFCLYI NLKKKETNV 219 
<212> Type : PRT 

<211> Length : 219

SequenceName : SEQ ID 114
SequenceDescription : 

Sequence 

<213> OrganismName : Helicobacter pylori J99
<400> PreSequenceString : 

MEIKKYFLYA LFFLLFSGLF LSKLQAYKFN MSIVGKVSSY TKFGFNNQRY QPSKDIYPTG 60 
SYTSLLGELN LSMGLYKGLR AEVGAMMAAL PYDSTAYQGN NIPNGQPGSR TDPFGAGIFW 120 

55 QYIGWYAGHS GLNVQKPRIiA MVHNAFLSYN YKKDKFSFGV KGGRYDAEEY DWFTSYTQGV 180 
EGFVKYKDTR LRVMYSDARA SASSDWFWYF GRYYTSGKAL MIADLKYEKD NLKINPYFYA 240 
IFQRMYAPGI NITYDTNPNF NNKGFRFVGT FVGFFPIFAT PANQNDIILF QQVPLGKSGQ 3 00 

TYFFRTRFYY NKWQFGGSVY KNIGNANGDI GIYGDPLGYN IWTNSIYDAE INNIVGADVI 3 60 

NGFLYVGSQY RGFSWKILGR WTDSPRADER SLALFLSYFS NKYNIRMDLK LEYYGNITKK 420 

60 GYCIGYCGMY VPVDPNGPGT QPLTHNVYSD RSHIMFNITY GFRIY 465 
<212> Type : PRT 
<211> Length : 465 

SequenceName : SEQ ID 115
SequenceDescription :

Sequence

<213> OrganismName : Helicobacter pylori J99
<400> PreSequenceString :

MKKTILLSLS LSLASSLLHA EDNGFFVSAG YQIGEAVQMV KNTGELKNLN DKYEQLSQSL 60 
AQLASLKKSI QTANNIQAVN NALSDLKSFA SNNHTNKETS PIYNTAQAVI TSVIAFWSLY 120 
5 AGNALSFHVT GLNDGSNSPL GRIHRDGNCT GLQQCFMSKE TYDKMKTLAE NLQKAQGNLC 180 
ALSECSSNQS NGGKTSMTTA LQTAQQLMDL lEQTKVSMVW KNIVIAGVTN KPNGAGAITS 240 
TGHVTDYAVF NNIKAMLPIL QQALTLSQSN HTLSTQLQAR AMGSQTNREF AKDIYALAQN 3 00 

QKQILSNASS IFNLFNSIPK DQLKYLENAY LKVPHLGKTP TNPYRQNWL NKEINAVQDN 360 
VANYGNRLDS ALSVAKDVYN LKSNQTEIVT TYNDAKNLSE EISKLPYNQV NVTNIVMSPK 42 0 

10 DSTAGQYQIN PEQQSNLNQA LAAMSNNPFK KVGMISSQNN NGAIiNGLGVQ VGYKQFFGES 48 0 

KRWGLRYYGF FDYNHGYIKS SFFNSSSDIW TYGGGSDLLV NFINDSITRK NNKLSVGIjFG 540 
GIQLAGTTWL NSQYMNLTAF NNPYSAKVNA SNFQFLFNLG LRTNLATAKK KDSERSAQHG 600 
VELGIKIPTI NTNYYSFLGT KLEYRRLYSV YLNYVPAY 638 
<212> Type : PRT
<211> Length : 638

SequenceName : SEQ ID 116
SequenceDescription :

Sequence

<213> OrganismName : Helicobacter pylori J99
<400> PreSequenceString : 

MKLKKRKVAA TLLKRLTLPL LFTTGSLGAV TYEVHGDFIN FSKVGFNRSP INPVKGIYPT 60 
ETFVNIiTGKL EGSVHLGRGW TVNVGGVLGG QVYDNTRYDR WAKDFTPPSY WDKTSCGTDS 12 0 

25 LSLCMNATKM WQQQGPGGII DPRGIGYMYM GEWNGLFPNY YPANAYLPGH SRRYEVYKAN 18 0 

LTYDSDRVHM VMGRFDVTEQ EQMDWIYQLF QGFYGTFKLT KNMKFIiLFSS WGRGIADGQW 240 
LFPIYREKPW GIHKAGIIYR PTKNLMIHPY VYLIPMVGTL PGAKIEYDTN PEFSGRGIRN 3 00 

KTTFYVLYDY RWNNAEYGRY APARYNTWDP FLDMGKWRGL QGPGGATLYL HHHIDINNYF 3 60 

WGGAYLNIG NPNMNLGTWG NPVALDGIEQ WVGGIYSLGF AGIDNITDAD AFTEYVKGGG 420 

30 KHGKFSWSVY QRFTTAPRAL EYGIGMYLDY QFSKHVKAGXi KLVWLEFQIR AGYNPGTGFL 480 
GPNGQPLNLlsr NGLFESSAFA QGPQNMGGIA KSITQDRSHL MTHISYSP 528 
<212> Type : PRT 
<211> Length : 528

SequenceName : SEQ ID 117
SequenceDescription :

Sequence

<213> OrganismName : Helicobacter pylori J99
<400> PreSequenceString :

MKNFSPLYCL KKLKKRHLIA LSLPLLSYAN GFKIQEQSLN GTALGSAYVA GARGADASFY 60 

NPANMGFTND WGENRSEFEM TTTVINIPAF SFKVPTTNQG LYSVTSLEID KSQQNILGII 120 

NTIGLGNILK ALGNTAATNG LSQAINRVQG LMNLTNQKW TLASKPDTQI VNGWTGTTNF 180 

VLPKFFYKTR THNGFTFGGS FTAPSGLGMK WNGKGGEFLH DVFIMMVEIiA PSMSYTINKR 24 0 

45 FSVGVGLRGL YATGSFNNTV YVPLEGASVL SAEQILNLPN NVFADQVPSN MMTLLGNIGY 3 00 

QPALNCQKAG GDMSDQSCQE FYNGLKKIMG YSGLIKASAN LYGTTQWQK SNGQGVSGGY 3 60 

RVGSSLRVFD HGMFSWYNS SVTFNMKGGL VAITELGPSL GSVLTKGSLN INVSLPQTLS 420 

LAYAHQFFKD RLRVEGVFER TFWSQGNKFL VTPDFANATY KGLSGTVASL DSETLKKMVG 4 80 

IiANFKSVMNM GAGWRDTNTF RLGVTYMGKS LRLMGAIDYD QAPSPQDAIG IPDSNGYTVA 540 

50 FGTKYNFRGF DLGVAGSFTF KSNRSSLYQS PTIGQLRIFS ASLGYRW 587 



<212> Type : PRT
<211> Length : 587

SequenceName : SEQ ID 118
SequenceDescription :

Sequence

<213> OrganismName : Helicobacter pylori J99
<400> PreSequenceString :

60 MAFQVNTNIN AMNAHVQSAL TQNALKTSLE RLSSGLRINK AADDASGMTV ADSLRSQASS 60 

LGQAIANTND GMGIIQVADK AMDEQLKILD TVKVKATQAA QDGQTTESRK AIQSDIVRLI 12 0 

QGLDNIGNTT TYNGQALLSG QFTNKEFQVG AYSNQSIKAS IGSTTSDKIG QVRIATGALI 180 

TASGDISLTF KQVDGVNDVT LESVKVSSSA GTGIGVLAEV INKNSNRTGV KAYASVITTS 240 

DVAVQSGSLS NLTLNGIHLG NIADIKKNDS DGRLVAAINA VTSETGVEAY TDQKGRLNLR 3 00 

65 SIDGRGIEIK TDSVSNGPSA LTMVNGGQDL TKGSTNYGRL SLTRLDAKSI NWSASDSQH 360 

LGFTAI6FGE SQVABTTVNL RDVTGNFNAN VKSASGANYN AVIASGNQSL GSGVTTLRGA 420 

MWIDIAESA MKMLDKVRSD LGSVQNQMIS TVNNISITQV NVKAAESQIR DVDFAEESAN 480 
FNKNNILAQS GSYAMSQANT VQQNILRLLT 510
<212> Type : PRT
<211> Length : 510

SequenceName : SEQ ID 119
SequenceDescription :

Sequence

<213> OrganismName : Helicobacter pylori J99
<400> PreSequenceString :

MAGTQAIYES SSAGFLSQVS SIISSTSGVA GPFAGIVAGA MTAAIIPIW GFTNPQMTAI 60 

MTQYNQSIAE AVSVPMKAAN QQYNQLYQGF NDQSMAVGNN ILNISKLTGE FNAQGNTQSA 12 0 

QISAVNSQIA SIIiASNTTPK NPSAIEAYAT NQIAVPSVPT TVEMMSGILG NITSAAPKYA 18 0 

IiALQEQLRSQ ASNSSMNDTA DSLDSCTALG ALVGSSKVFF SCMQISMTPM SVSMP.TVYAK 24 0 

15 YQAVATKALT SGVNPMTTPA CPIGDICVLAV YCYAEKVAEI LREYYIEFVK NNTNLLQNAS 3 00 

QMILNQSGLA TSTYDTQAIS NISSLYNYNI VANKSPLKSH LTYLDYIKDK LKGQKDSYLT 3 60 

ERVQTKIIVK 370 
<212> Type : PRT 
<211> Length : 370 

SequenceName : SEQ ID 120
SequenceDescription :

Sequence

<213> OrganismName : Helicobacter pylori J99
<400> PreSequenceString : 

MTNEAINQQP QTEAAFNPQQ FINNLQVAFI KVDNWASFD PNQKPIVDKN DRDNRQAFEK 60 

ISQLREEFAU KAIKNPTKKN QYFSSFISKS NDLIDKDNLI DTGSSIKSFQ KFGTQRYQIF 12 0 

MNWVSHQNDP SKINTQKIRG FMENIIQPPI SDDKEKAEFL RSAKQAFAGI IIGNQIRSDQ 180 

30 KFMGVFDESL KERQEAEKNG EPNGDPTGGD WLDIFLSFVF NKKQSSDLKE TLNQEPVPHV 240 

QPDVATTTTD IQSLPPEARD LLDERGNFSK FTLGDMNMLD VEGVADIDPN YKFNQLLIHN 3 00 

NALSSVLMGS HNGIEPEKVS LLYGNNGGPE ARHDWNATVG YKNQRGDNVA TLINVHMKNG 3 60 

SGLVIAGGEK GINNPSFYLY KEDQLTGSQR ALSQEEIQNK VDFMEFLAQN NAKLDNLSKK 42 0 

EKEKFQNEIE DFQKDSKAYL DALGNDHIAF VSKKDKKHLA LVAEFGNGEL SYTLKDYGKK 480 

35 ADKALDREAK TTLQGSLKHD GVMFVDYSNF KYTNASKSPD KGVGATNGVS HLEAGFSKVA 540 

VFNLPNLNNL AITSWRQDL EDKLIAKGLS PQEANKLVKD FLSSNKELVG KALNFNKAVA 600 

EAKNTGNYDE VKQAQKDLEK SLKKRERLEK DVAKNLESKS GNKNKMEAKS QANSQKDEIF 660 

ALINKEAISTRD ARAIAYAQNL KGIKRELSDK LENINKDLKD FSKSFDEFKN GKNKDFSKAE 720 

ETLKALKGSV KDLGINPEWI SKVENLNAAL NEFKNGKNKD FSKVTQAKSD LENSIKDVII 78 0 

40 NQKITDKVDN LNQAVSVAKA TGDFSGVEQA LADLKNFSKE QLAQQAQKNE DFNTGKNSAL 840 

YQSVKNGVNG TLVGNGLSKA EATTLSKNFS DIKKELNAKL GNFNNNIJNNG LENSTEPIYT 900 

QVAKKVKAKI DRLDQIASGL GDVGQAASFL LKRHDKVDDL SKVGLSANHE PIYATIDDLG 960 

GPFPLKRHDK VDDLSKVGLS REQKLTQKID NLNQAVSEAK ASHFDNLDQM IDKLKDSTKK 102 0 

NWNLYVESA KKVPTSLSAK LDNYATNSHT RINSNVKNGT INEKATGMLT QKNSEWLKXiV 10 80 

45 NDKIVAHNVG SAPLSAYDKI GFNQKMMKDY SDSFKFSTRL SNAVKDIKSG FVQFLTNIFS 1140 

MGSYSLMKAS VEHGVKNTNT KGGFQKS 1167 
<212> Type : PRT 
<211> Length : 1167 

SequenceName : SEQ ID 121 

SequenceDescription :

Sequence

<213> OrganismName : Helicobacter pylori J99
<400> PreSequenceString :

MKTNGHFKDF AWKKCFLGAS WALLVGCSP HIIETNEVAL KLNYHPASEK VQALDEKILL 60 
LRPAFQYSDN lAKEYENKFK NQTTLKVEEI LQNQGYKVIN VDSSDKDDFS FAQKKEGYLA 120 
VAMNGEIVLR PDPKRTIQKK SEPGLLFSTG LDKMEGVLIP AGFVKVTILE PMSGESLDSF 180 
TMDLSELDIQ EKFLKTTHSS HSGGLVSTMV KGTDNSNDAI KSALNKIPAS IMQEMDKKLT 240 

60 QRNLESYQKD AKELKNKRNR 260 
<212> Type : PRT 
<211> Length : 260 

SequenceName : SEQ ID 122 
SequenceDescription :

Sequence
<213> OrganismName : Mycoplasma j
<400> PreSequenceString :
MKSKLKLKRY LLFI.PLLPLG TLSIiANTYLL QDHNTLTPYT PFTTPLNGGL DWRAAHLHP 60
SYELVDWKRV GDTKLVALVR SALVRVKFQD TTSSDQSNTN QNALSFDTQE SQKALNGSQS 120
GSSDTSGSNS QDFASYVLIF KAAPRA.TWVF ERKIKLALPY VKQESQGSGD QGSNGKGSLY 180
KTLQDLLVEQ PVTPYTPNAG LiARVNGVAQD TVHFGSGQES SWNSQRSQKG LKJSHsTPGPKAV 240
TGFKLDKGRA YRKLNESWPV YEPLDSTKEG KGKDESSWKN SEKTTAENDA PLVGMVGSGA 300
AGSASSLQGN GSNSSGLKSL LRSAPVSVPP SSTSNQTLSL SNPAPVGPQA WSQPAGGAT 360
AAVSVmTAS DTATFSKYLN TAQALHQMGV IVPGLEKWGG NNGTGWASR QDATSTNLPH 420
AAGASQTGLG TGSPREPALT ATSQRAVTW AGPLRAGNSS ETDALPNVIT QLYHTSTAQL 480
AYLNGQIWM GSDRVPSLWY WWGEDQESG KATWWAKTEL NWGTDKQKQF VENQLGFKDD 540
SNSDSKNSNL KAQGLTQPAY LIAGLDWAD HLVFAAFKAG AVGYDMTTDS SASTYNQALA 600
WSTTAGLDSD GGYKALVENT AGLNGPINGL FTLLDTFAYV TPVSGMKGGS QNNEEVQTTY 660
PVKSDQKATA KIASLINASP LNSYGDDGVT VFDALGLNFN FJCLNEERIiPS RTDQLLVYGr, 720
VNESEIiKSAR ENAQSTSDDN SNTKVKWTNT ASHYLPVPYY YSANFPEAGN RRRAEQRNGV 780
KISTLESQAT DGFANSLLNF GTGLKAGVDP APVARGHKPN YSAVLLVRGG WRLNFNPDT 840
DKLLDSTDKN SEPISFSYTP FGSAESAVDL TTLKDVTYIA ESGLWFYTFD NGEKPTYDGK 900
QQQVKNRKGY AVITVSRTGI EFNEDANTTT LSQAPAAIiAV QNGIASSQDD LiTGILPLSDE 960
FSAVITKDQT WTGKVDIYKM TNGLFEKDDQ LSENVKRRDN GIiVPIYNEGI VDIWGRVDFA 1020
ANSVLQARNL TDKTVDEVIN NPDILQSFFK FTPAFDNQRA MLVGEKTSDT TLTVKPKIEY 1080
LDGNFYGEDS KIAGIPLNID FPSRIFAGFA ALPSWVIPVS VGSSVGIIiLI LLILGIiGIGI 1140
PMYKVRKLQD SSFVDVFKKV DTIiTTAVGSV YKKIITQTSV IKKAPSALKA ANNAAPKAPV 1200
KPAAPTAPRP PVQPPKKA 1218
<212> Type : PRT
<211> Length : 1218

SequenceName : SEQ ID 123
SequenceDescription :

Sequence

<213> OrganismName : Mycoplasma pneumoniae
<400> PreSequenceString :

MHQTKKTALS KSTWILILTA TASLATGLTV VGHFTSTTTT LKRQQFSYTR PDEVALRHTN 60 

AXNPRIiTPWT YRNTSFSSIiP LTGENPGAWA LVRDMSAKGI TAGSGSQQTT YDPTRTEAAL 120 

35 TASTTFALRR YDLAGRALYD LDFSKLNPQT PTRDQTGQIT FNPFGGFGIiS GAAPQQWNEV 180 

KtlKVPVEVAQ DPSNPYRFAV LLVPRSWYY EQLQRGLGLP QQRTESGQNT STTGAMFGLK 240 

VKNAEADTAK SNEKLQGAEA TGSSTTSGSG QSTQRGGSSG DTKVKALKIE VKKKSDSEDN 3 00 

GQLQLEKNDL ANAPIKRSER SGQSVQLKM DFGTALSSSG SGGNSNPGSP TPWRPWLATE 3 60 

QIHKDLPKWS ASILILYDAP YARNRTAIDR VDHLDPKAMT ANYPPSWRTP KWNHHGLWDW 42 0 

40 KARDVLLQTT GFFNPRRHPE WFDGGQTVAD NEKTGFDVDN SENTKQGFQK EADSDKSAPI 480 

AliPFEAYFAN IGNLTWFGQA LLVFGGNGHV TKSAHTAPLS IGVFRVRYNA TGTSATVTGW 54 0 

PYALLFSGMV NKQTDGLKDL PFNNNRWFEY VPR^1AVAGAK EVGRELVLAG TITMGDTATV 600 

PRLLYDELES NLNLVAQGQG LLREDLQLFT PYGWANRPDL PIGAWSSSSS SSHNAPYYFH 660 

NNPDWQDRPI QNWDAFIKP WEDKNGKDDA KYIYPYRYSG MWAWQVYNWS NKLTDQPLSA 720 

45 DFVNENAYQP NSLFAAILNP ELIiAALPDKV KYGKENEFAA NEYERFNQKL TVAPTQGTNW 78 0 

SHPSPTLSRF STGFNLVGSV LDQVLDYVPW IGNGYRYGNN HRGVDDITAP QTSAGSSSGI 840 

STNTSGSRSF LPTFSNIGVG LKANVQATLG GSQTMITGGS PRRTLDQANL QLWTGAGWRN 90 0 

DKASSGQSDE NHTKFTSATG MDQQGQSGTS AGNPDSLKQD NISKSGDSLT TQDGNAIDQQ 960 

EATNYTNLPP NLTPTADWPN ALSFTNKNNA QRAQLFLRGL LGSIPVLVNR SGSDSNKFQA 1020 

50 TDQKWSYTDL HSDQTKLNLP AYGEVNGLLN PALVETYFGN TRAGGSGSNT TSSPGIGFKI 108 0 

PEQNraSKAT lilTPGLAWTP QDVGNLWSG TTVSFQLGGW LVTFTDFVKP RAGYLGLQLT 1140 

GLDASDATQR ALIWAPRPWA AFRGSWVNRL GRVESVWDLK GVWADQAQSD SQGSTTTATR 1200 

NALPEHPNAL AFQVSWEAS AYKPNTSSGQ TQSTNSSPYL HLVKPKKVTQ SDKLDDDLKN 1260 

LLDPNQVRTK LRQSFGTDHS TQPQPQSLKT TTPVFGTSSG NLSSVLSGGG AGGGSSGSGQ 132 0 

55 SGVDLSPVEK VSGWIiVGQLP STSDGNTSST NNLAPNTNTG NDWGVGRLS ESNAAKMNDD 13 80 

VDGIVRTPLA ELLDGEGQTA DTGPQSVKFK SPDQIDFNRL FTHPVTDLFD PVTMLVYDQY 1440 

IPLFIDIPAS VNPKMVRLKV LSFDTNEQSL GLRLEFFKPD QDTQPNlSnsrVQ WPNNGDFLP 15 00 

LLTASSQGPQ TLFSPFNQWP DYVLPIjAXTV PIWIVLSVT LGLAIGIPMH KNKQALKAGF 1560 

ALSNQKVDVL TKAVGSVPKE IINRTGISQA PKRLKQTSAA KPGAPRPPVP PKPGAPKPPV 162 0 

60 QPPKKPA 1627 
<212> Type : PRT 
<211> Length : 1627 

SequenceName : SEQ ID 124 
SequenceDescription :

Sequence
<213> OrganismName : Mycoplasma pneumoniae
<400> PreSequenceString :
MGYKLKRWPIi VAFTFTGIGL GWLAACSAL NTSNLFPRQN RSKQLIGFTE mrilKPEAVIi 60
KAALAEDNGT ETILRVNFGE ALKSWYQNNK DRNIATRLTI FSENVEDEHD NLLDQKQQAE 120
PINWPIELQK EYDQWGGSES SWKALKLYDR LIADFQSLIF SNIVANVQLT DGSDQFKPTT 180
KDNLDSTSNK IKFVNSKPND PNGEPFANLQ AYLFAQWWE ENPLPLTQAF FAYQAPKDGL 240
DSLYDQAAIG SALQLGYAFP APREPNNGQS QGKTTFDPTP NSAQNFGDFI KAVFPEQKNG 300
QTQQSNTSSR TGLFDWQTKW NTNGAANKLL VTKSNLRGAF KGVGLATAII DQYEYLVGGS 360
KTSSLPEVKV DSNKSNQNPL DSFFMEGKDA VAIRSIVSRA KIAMTDQTPG FKVNPAFVKV 420
KQSQQNDTFY QNQRKLSGGQ SGDNNSQGKH HYLQDAVRLT SSQAMAAAST GADSSSGTNV 480
GGSSGGNSVL IPLPRSAALT HTQQQVQQTT STLQTPVYAR GDDGTYALAI DGGDYFLA3SI1T 540
KRDFTKQADI LLYRYLQAKS NNFKENGVEF SLNLLESGSL FQTWAQTGLT AiCLYGALVAM 600
MGSGQGTQVK GSVQGSSRAA SVSVQTTQQN RQQSTDTQES EWKLAKSLL KSSADLAKPF 660
TDMPTFKKAIj TniQSEYKDY LAAAGKLSEF KKDLGEVSGL jQQAIIDRADK YI QLEKQAQK 720
SAIGLGQPLP YQRASDGSYP ALEKFFIPED SAADGKVKAS ESGSAALVTL KTTDSQKSTN 780
TVKQPDIKPT RENNDKKLKQ LTSDVETKAS SLITKWGATP QIGSQFSEIV SLKSKDNKPQ 840
TNMILAIiLSD VGIKWTKILN SFKEWFFTNT NDFKNNYDSE KKELKGNEYK DFNDIiVKQTL 900
YLRSWQRLTS KEKFGYYKEL GSVKAQAAQS GMVSLSSSAA VANAVASSGM QKSGDQTLLE 960
LGKKAFESEL EASSSDGQYK YLRFIiSTLMW LVKDGAKNYK RLLQQAITVG TRAFVSWTVS 1020
YDDTATASAA AAKAQVAVLK TAQATNTQSD NPFNKFVQNP DYVQGSETNW FNDKSTPIKP 1080
DSLLESESTY NFTAEPFDDK TKSQKRSTGG TTNEKHFFGF NGLTINSPQS VSTASAGLTE 1140
QIFNNFGQLV TSSDKSGALS QYKDKATLKR LIQNTNSDAE LNAFGEVLHR AVNVDTSNLG 1200
RFNSSGEPLI SFDNKKKFLV DWDKLDDVY FNKFEGYVGQ TKVKMSDSSS SSQGTKTIRK 1260
PKPHHSPRTR VSRLWAMSFR LPTRTLTKFL LVEKLIRTVL 1300
<212> Type : PRT
<211> Length : 1300

SequenceName : SEQ ID 125
SequenceDescription :

Sequence

<213> OrganismName : Mycoplasma pneumoniae
<400> PreSequenceString :
MKKLIiIKPQF WFLTLGGEXS SSVILVACAT PSNSALQTVF KARSNQFFNG EQGSIiQNAIiA 60
TALKDPEANK QFVAAPLLKA LTAWYENNQD KQVTQFFKDT KKSVDEQYNQ AVDKWSASR 120
NKNLFVQQDL LDSAGGVRNL KSPEWWTAH 150
<212> Type : PRT
<211> Length : 150

SequenceName : SEQ ID 126
SequenceDescription :

Sequence

<213> OrganismName : Mycoplasma pneumoniae
<400> PreSequenceString :
MQQQGETKDQ YNTFGLRLVR NSVGVSVLGL DGFVKFIKGG SGGSNGGSSS AKKIDKEEQK 60
KFLKFRAFQA KIGTFYNTNF AFSFPLNETL KGWFDKHRGL ILANALVKVT LDTKEKASKA 120
LVDAFSSYKN WLSEYTPVGL ATTMISFYFD QMKALNNKLL ERVRSLNQNV NQANPTPWLN 180
GLSAKIjPYVN TNGNYEKLNN YFTFLITKVL WPKVGTEDTN VSEEKSKLKT KTEDWKIRE 240
KILNNIDSKXj KTFVQKLKPT LAPRPAYSNV ILLNINNDKV WSAGANWSLA VLLDPKKVNP 300
LSFMLLKQMF DQNSLFKKAK TLFEMIQNKA KTSGSGKSGT TTNDDADALS KVIGNYYYNT 360
WAKLTDKSIY GNIiKDDKFDD LFKLAFDSSI NEKSFNVDYK AVIEHYRFIY TLEWLVDKNL 420
KNFKDLLKAN LKFGEIAFIA YKNTETQNFS NPQGIPGSYF NYENETNAAK SATQIIDPNS 480
FFYKTTTKPE AKTTQSANTA VMVQNTQMNN QQTMSYGFTG LSTSSGSMLG AATQQAILDQ 540
ITKTSLQQYG SQADLKKIIG ETKNQLLLDR lANQLIALKP NTSGNSGTQK TIAAYFQTDA 600
VGMPTLDFKA KQKIiLLDVLD QYKDFFGNNA QAVQRDSGKS GTGNYLTYTD GSDKITYLQF 660
SYKDIDGLSIi SSSNGTSSKF ASDWAALLL FQAAYKGTQQ LALSSINKPQ LPIGDKRIKT 720
GIDLLK 726
<212> Type : PRT
<211> Length : 726

SequenceName : SEQ ID 127
SequenceDescription :

Sequence

<213> OrganismName : Mycoplasma pneumoniae
<400> PreSequenceString :
MKSFLRKPKF WLLLLGGLST SSIILSACAT PSNSALQAVF KPTSNQFFNG EHGTIQSALN 60
TALRDPETNK KFVAAPLLKA LEAWYENNQD KNITQFLKDT KTITVDNQYKT WDKWSAPR 120
NKSLFVQQDL LDSSGGSEAT WKARKLFEQL ISDFASRVFQ KNYLSYKENG KVSAGPFLYD 180
TISKNSNWQN IVFDAVNFPE TNDDFFAKIQ SEVFDQWAEY TDPTIISSVT LKYSAPN 237
<212> Type : PRT
<211> Length : 237

SequenceName : SEQ ID 128
SequenceDescription :

Sequence

<213> OrganismName : Mycoplasma pneumoniae
<400> PreSequenceString :
MINFLFNQMN AIiNNKFLERA KALNQNVNQA NPTPWLNGLS AKLPYVRTNG NYEKLNNYFT 60 
FLIVKYMWKK VGNEDASLSK DSSINKLKTK TEDVNKIRDK ILEDIQKKVQ EFVKNKLKPT 120 
LAPRQTYSNV ILLNVIJNDKV WSMGANWALA NLLDTSKINP LSFMLLKQTF DQNDLFKKAK 18 0 

KLFEDIQSKT NGGSSGGMQG SNTSSSEGAD ALSKVIGNYY YNSWAKLTDK SIYGNPKDNK 240 
FDDLFKLAFE DSINEKSFNV DYKAVIEHYR FIYTLEWIiVN GNLKNFKDLL KANLKFGEIA 3 00 

FIAYKNTETK EFSNPQGVFG SAFNYENETN EVKIAAQNLD PNNFFYKTTT KPEEVKTAQM 360 
GASMMVMQQK MQSTMQDSNH YGFTGLNTST SSMLGAATQQ AILDQITKMS LQQYGSQQEL 42 0 

KTLIEKTNNQ LLLDRIASQL SGLNPSTTGN SNNGKGKNIA TYFQLDAIGN PTLSFQQKRK 480 
LLLDVLDQYK DFFGTNTQAA QRDSGKGGHG SYSTYQDGSD KITYLQPSYK DIDNLSLSDK 540 
GNSKLASDW AALLLFQAAD KGTQQLALSA IN 572 
<212> Type : PRT 
<211> Length : 572 

SequenceName : SEQ ID 129 

SequenceDescription : 

Sequence 



<213> OrganismName : Mycoplasma pneumoniae
<400> PreSequenceString : 

MKKFLRKPQF WLIiTLGGFLS TSVILAACAT PSNSALQTVF KZVRSSQFFNG EQGSLQSALT 60 
TALKNPVANK QFIAAPLLKA LEAWYENNED KKITQFLKDT KSNVDSQYTT AVDKWSASR 120 
NKSLFVQQDL LDNAGGSEAT WKAQKLLEQL ISDFASRVFQ KNYLNYKKDG QVSTGPFTYD 180 
ELHKEESWKN FEFSAPRFSE TNDDFFAKIQ SQVFDQWVEY TDPTLISQVN YKYSAPSQGL 240 
GQIYNREKLK DKLTPSYAFP FFAEEKDIAP NQNVGNKRWK QLVKGEGAIT DNNIGQSGTN 3 00 

SQKTGLLKYR NESNKGDFLD FPLNLSDTNE TKQLVDASNI VDQLEAANLG AALNLKLQVF 3 60 

EQDNDELPQI KELKEDLNNT IWDKSKDVE KASKTNALFY NDQEGKQQQS DSDPIAGALD 420 
DIFAQNTSEG TNLSKLAEQV KKAAATKMEA KTAVLRTNNS KGQQNNYWL DAAIPTFNST 4 80 

TSKSKNNSAS NEVLVALKSG SINLRQVQQT DQNSYSPIKF RIVRNSTGVT VFGLDGGSYY 540 
LKQDSTNKKS VSKQSLTLLT KSSSGNSNKV LRDLDKQKQF LKFRAFQAKT NTFYSTNFAF 600 
SFPLNETLKS WFDKHRELIL ANALVNASLD QKDKASKALT EAFNPYKELI KEFAPVALAT 660 
TMISFYFDQM KALNNKLLER ARNLNQNVNQ ANPTPWLNGL SAKLPYVNTN GNYEKLNNYF 720 
TFLITKTLWP KVGQEETSIS EESNKLKTKT ADVDKIRDKI LENIQTKVND FVKNKLKPAL 780 
APRPAYSNVI LLNVNNDKVL SSGANWSLAS LLQSDKVNPL SFMLLKQAFD NNDLFKKAQK 840 
LFKDIQEKSS NNGGMQSSST TNSDADALSK VIGNYYYTTW AKLTDKSIYG NPKDNKFDEL 9 00 

FKLAFEASID EKSFNVDYKA VIDHYRFIYT LQWLVDQKLK NFKSLLKTNL KFGEVAFIAY 9 60 

KNTETTNFSN PQGVFGSYFN YENSASEVKE STQTLDPNNF FYKTTTKPTV QAIQQVASLA 1020 

LVQKQQMQQN STDHYGFTGL STSTSSMFDA SSRDAILQQI TKTSLQQYGS KDQLKKIIQG 10 80 

TNNQLLLDRI AVQLSGLNPS TTNGGSGKTI ATYFQVDAVG NPTLDFQAKR KLLLDLLDQY 1140 

QNYFGNGAQK SQRDSTPSGT GNYLTYQNGS DKYTYTQFTY QDIDSLSLTT TSGTNNKIAS 1200 

DWAALLLFQ AADKGTQQLA LSAINKPQLN IGDKRIESGL KLLK 1244 
<212> Type : PRT 
<211> Length : 1244 

SequenceName : SEQ ID 130 

SequenceDescription : 

Sequence 



<213> OrganismName : Mycoplasma pneumoniae 
<400> PreSequenceString : 

MVGSGAAGSA SSLQGNGSNS SGLKSLLRSA PVSVPPSSTS NQTLSLSNPA PVGPQAWSQ 60 

PAGGATAAVS VNRTASDTAT FSKYLNTAQA LHQMGVIVPG LEKWGGNNGT GWASRRDAT 120 

STNLPHAAGA SQTGLGTGSP REPALTATSQ RAVTWAGPL RAGNSSETDA LPNVITQLYH 180 

TSTAQLAYLN GQIWMSSAR VPSLWYWWG EDQESGKATW WAKTELNWGT DKQKQPVENQ 240 
LGFKDDSNSD SKMSNLKTQG LTQPAYLIAG LDWADHLVF AAFKAGAVGY DMTTDSNAST 300

YNQALVWSTT AGLDSDGGTR LW 322
<212> Type : PRT
<211> Length : 322

SequenceName : SEQ ID 131
SequenceDescription :

Sequence

<213> OrganismName : Mycoplasma pneumoniae
<400> PreSequenceString :

MPVFLKLTHT IRKVLRVARL SRIoALLSLTA VIFSGCANIN LISAVGSSSV QPLLSKLSSH 60 

YVLMHNDKDN LVEISVQAGG SSAGVKAITK GLADIGNVSK NTKSYAEENK QLWMDKKLKT 120 

XTI.GKDAIAV lYKAPSEFKG KLVLTKDNLN DLYDLFAGSK SVDINI-CFVEN GQTTKNSiJHN- . _ -^18.0 

15 LIGFPRTGGA FASGTAEAFL KFSGLTQTKT LDKDSKEILE GQRNYGPNAR PTSETNIEAF 240 

NTFVTTLRQP NLYGMVYLSL GFVNNNMNItl KSEGFEVLKV KYDNNAVTPS SQAVSSNTYK 3 00 

WVRPLNSWS liLPKQKTLPS IQRFFNWLLP SNNSEIKKIY DDFGVLELTA DEKKKMFKT6 3 60 

NAEMSNIANF WVDDYSLNNQ TFGAL 385 
<212> Type : PRT 
<211> Length : 385

SequenceName : SEQ ID 132
SequenceDescription :

Sequence

<213> OrganismName : Mycobacterium tuberculosis H37Rv
<400> PreSequenceString :

MSFAVLPPEI NSARLYVGAG LAPMLDAAAA WDGLADELGS AAASFSAVTA GLAGSSWLGA 60 

ASTAMTGAAA PYLGWLSAAA AQAQQAATQT RLAAAAFEAA LAATVHPAII SANRALFVSL 12 0 

30 WSNLLGQNA PAIAATEAAY EQMWAQDVAA MFGYHAGASA AVSALTPFGQ ALPTVAGGGA 180 

LVSAAAAQVT TRVFRNLGLA NVGEGNVGNG NVGNFNLGSA NIGNGNIGSG NIGSSNIGFG 240 

NVGPGLTAAL NNIGFGNTGS NNIGFGNTGS NNIGFGNTGD GNRGIGLTGS GLLGFGGLNS 3 00 

GTGNIGLFNS GTGNVGIGNS GTGNWGIGNS GNSYNTGFGN SGDANTGFFN SGIANTGVGM 360 

AGNYNTGSYN PGNSNTGGFN MGQYNTGYLN SGNYNTGLAN SGNVNTGAFI TGNPNNGFLW 42 0 

35 RGDHQGLIFG SPGFFNSTSA PSSGFFNSGA GSASGFLNSG ANNSGFFMSS SGAIGNSGLA 480 

NAGVLVSGVI NSGNTVSGLF NMSLVAITTP ALISGFFNTG SNMSGFFGGP PYFISTLGLANR 540 

GWNILGNAN IGNYNILGSG NVGDFNILGS GNLGSQNILG SGNVGSFNIG SGNIGVFNVG 60 0 

SGSLGNYMIG SGNLGIYNIG FGNVGDYNVG FGNAGDFNQG FANTGNNNIG FANTGNWNIG 660 

IGLSGDNQQG FNIASGWNSG TGNSGLFNSG TNNVGIFNAG TGNVGIANSG TGNWGIGNPG 720 

40 TDNTGILNAG SYNTGILNAG DFNTGFYNTG SYNTGGFNVG NTNTGNFNVG DTNTGSYNPG 78 0 

DTNTGFFNPG NVNTGAFDTG DFNNGFLVAG DNQGQIAIDL SVTTPFIPIN EQMVIDVHNV 84 0 

MTFGGNMITV TEASTVFPQT FYLSGLFFFG PVNLSASTLT VPTITLTIGG PTVTVPISIV 90 0 

GALESRTITF LKIDPAPGIG NSTTNPSSGF FNSGTGGTSG FQNVGGGSSG VWNSGLSSAI 960 

GNSGFQNLGS LQSGWANLGN SVSGFFNTST WLSTPANVS GLNNIGTNLS GVFRGPTGTI 102 0 

45 FNAGLANLGQ LNIGSANLGD FNLGSGNVGS FNVFSGNQGS YNIGPANLGN YNIGFANLGN 1080 

YNIGFGNAGD FNQGFANTGN NNIGFANTGN NNIGIGLSGD NQQGFNFAGG WNSGTANIGL 1140 

FNSGTNNVGI GNSGTGNWGI GNSGSGNTGI GNTGSTNTGF FNTGIVNTGV ANAGSYNTGW 1200 

YNTGDTNTGI ANLGDFNTGF YNTGNFSTGF ANQGDIATGA FITGDMGNGA PWRGDQQGLF 1260 

SAGYRVHVPE IPAHVTVEVP VNIPITASFT NTVYSGITLE QIMFGFTIDI AGIPLLAGAI 13 2 0 

50 SKAVLPPITG TGPAITVNIG DPGGSTAIRI PATASVGPFD VTFVNIAATT GFFNATTDPS 13 80 

SGFFNGGPGT VSGIANIGAN ISGFQNVANS ATSGFNNYGS LQSGLAMLGD TVSGVFNTGI 1440 

GAPANVSGMF NIGSNLAGFF HDQATGMSMF NLGLGNIGQF NVGFSNVGDS NAGLANIGSF 15 0 0 

NLGSGNLGSF NVFGGMQGSY NIGPANLGNY NIGLGNLGSY NFGFGNAGDF NLGFANTGNN 1560 

NIGFANTGNN MIGIGLSGDN QQGFNFAGGW NSGSGNSGLF NSGTNNIGLF NSGTGNIGIG ' 1620 

55 NSGTGNWGIA NTGDTNTGIF NTGDVNTGLL NAGNVNTGIF NTGHYNTGSF NAGSFNTAGF 1680 

NPGSYNTGYL NTGSYNTGLA NSGDVNTGGF ITGNYSNGFW WRGDYQGLAG ISQTITVPDT 1740 

AVPVKLHVPI FLDIPVTGTL GTFTVHGFRF PEITGDIFLI GIPFNAATLD AFSFPNISIV 180 0 

LPNIGINLGS GPDPLIDIAG TGGLLPIKIP LIDIPAAPGF GNSTTTPSSG FFNAGTGTVS 18 60 

GVGNVGSNSS GFFNLTSGSS GISGVQNFGE LISGGFNFGN TVSGLVNAST LGLSMPANLS 1920 

60 GGGNVGATVA GFVNNTQILN LGFGNVGSGN VGHGNIGDSN VGLGNLGNAN VGHGNIGSFN 1980 

VFSGNRGSYN IGPANLGNYN IGLGNLGSYN FGFGNAGDFN LGFANSGSNN IGFANTGNNN 204 0 

IGIGLSGHNQ QGFGSWNSGT ANTGLFNSGT NNIGLFNSGT GNIGIGNSGI GNTGIGNPGV 2100 

GNTGLGNSGT GNWGLWNPGT GNMGVANVGT YNTGGYNVGS TNTGIANVGI ANTGSYNTGS 2160 

TNTGSPNDGD FNTGFYNTGD YNTGFYNTGD VNTGAFIGGN FSNGAFWQSD HQGQWGAHYA 222 0 

65 ITVPQIPLLN FSLNIPVNIP IHLDFGTLAV NGFQIPAITL RALGVTHFSV GPIIVPRIAG 22 80 

TLPVIDINIG DP6GSSSIPI TITSGAGPW IPLLDIPPAP GFGNSTTGPS SGFFNSGTGS 2340 

SSGFGNVGAN NSGPWNXAFA GIGNSGLQNF GSLQSGWANL GNTVSGFYNT SAADFATPAN 2400 
LSGLSNVGAD LTGVLRGPNG STFNAGLANL GQFNVGSANIi GSANLGSANL GSAlsILGNSNV 2460
GFGNIGNAlSri GGANIGDPNV 6XANTGPGLT AAVNNIGXGBr TGNYNIGVGN TGNYNIGFGN 2520
TGNNNIGIGL SGDNQIGFGP LNAGIAIJMGL FNLGDNNFGM ANAGNFNQGI ANTGNlsINIGL 2580
PNTGNNNVGI WLT6D6LSGP SSLNSGAGHT GFFNSGTANT GLFNSGTGNT GLFNSGTGNV 2640
GIGNMGTGGF GVGLSGDSQV GIGGTKTSGSF NIGIiFNSGTG NVGIGNSGTG NVGIGNTGTG 2700
NT6IGNSGNY NTGLLNAGLV NTGIANPGNH NTGLFNIGTF NTGIANPGHY NTGSYNTGSY 2760
NTGMANAGDY GTGAFXTGSM NNGLLWRADR QGLIiAANYTI TIERPAAFLU VDIPVNIPIT 2820
GDITNVSIPA ITFPRIDASG SVDIGILSGT VLAPVGPITL HGGDASAPLD TPIEIDFGPS 2880
PAINLNIGKP DGSTVINIVG GAGAGPISIP IXDIiRPAPGF FNATTGPSSG FLNWGAGSAS 2940
GIiLNFGNNSG LYNFATSSMG NSGFQNYGSL QSGWANLGWS ISGIYNTGLG APANVSGLLN 3000
IGTNIiAGWLQ NGPTETTFSV GLANLGFWNL GSANIGNYNL GSANIGVYNL GSANIGDFNL 3060
GSANIGDFNL GSANIGSSNI GFGUVGPGLT AAIGNIGFGN TGNGNIGIGN TGTGNIGFGN 3120
TGNGNIGIGL TGDTMTGFGG WNSGTGNIGL FNSGTGNIGF GNSGTGNWGI GNSGDYNTGI 3180
GNTGSTNSGF PNTGLVNTGX GNSGDYNTGL FNAGNTNTGS FNPGDYNTGG FNPGNYKTGY 3240
FNPGNSNTGI ANS6DVNTGA FNSGNYSNGF FWRGDYQGLG GFAYQSAVSE IPWSYDRFQH 3300
<212> Type : PRT
<211> Length : 3300

SequenceName : SEQ ID 133
SequenceDescription :

Sequence

<213> OrganismName : Mycobacterium tuberculosis H37Rv
<400> PreSequenceString :

MNIiVSTTSGM SGFLNVGALG SGVANVGNTI SGIYNVGTSD LSTPAVNSGL ANIGTNIAGL 60 

LRDGAGTAAI NLGLANHGNL NVGFASLGGF NFGGATIGHN NVGIGNTGIF DVGIiANLGSY 12 0 

NIGFGNLGDD NIiGFGNFGSY NIGFGNVGND NLGFANAGGG NIGFANTGSN MVGFGNTGSN 180 

NVGIGIiTGNG QIGFGSFNSG SGNIGLFMSG SNNIGFFNSG SGNFGIANSG SFNTGIGNTG 240 

30 NTNTGIiFNSG DVNTGAFNPG SFNTGSFNTG SFNTGGFNPG NTNTGYLNIG NYNTGIANTG 3 00 

DVDTGAFITG NYSNGLFLSG DYQGLVGLNL VIDMPLPISIi GVNIPIDIPI TASAGNITLM 3 60 

GVTXPPTGDI VLSSXAGQRA HFGPITIPNI TWGPTTTVA IGGPNTAXTI TGGGAIRIPL 42 0 

rSIPA2^GFG NSTTNPSSGF FNTGAGGASG FGMFGGANSG FWNLASATSG ASGLLNVGAL 48 0 

GSGLAWGTT VSGFYHTSTS DIiATPAFNSG LANISTSIAG LLRDSTGTMV LNLGLAKHGT 540 

35 LNVGIANLGD YNIGFANLGS ANFGSANIGG NNIGGANTGI FDXGLANLGS YNIGFGNFGD 600 

DNIiGFGNLGS YNVGFGNIiGN DNLGFANTGS NNIGFANTGS UnsTIGlGLTGD GQIGFGSLNS 660 

GSGNIGLFNS GSGNIGFFNS GNGNVGIGNT GTANFGLGNT GSTNTGFFNS GDWTGIGNT 72 0" 

GSFNTGSFNP GDSNTGDFNP GSYNTGLGNT GDVDTGAFIS GSYSNGFLWS GMYQGLIGLH 780 

AALAIPEIAL TFGVDIPIHI PINIDAGWT LQGFSIVAAE NNXDFTPIII PTINITLPTA 840 

40 AITVGGPTTS XGXTASAGXG SITIPIIDXP ATSGFGNSTT SPSSGFFNSG AGSASGFLNV 90 0 

VAGASGISGY LNVGALGSGV TNVGHTVSGF YNASALDLVT PAFASGLMRD GMGTMTLNLG 960 

IiAlsniiGSNNAG FGNTGIFDVG VANLGNYNIG FGNFGDDNLG FANLGSYNIG VANTGSIMIG 102 0 

FANTGSNNIG IGLTGTGQXG IGALNSGSGN IGLFNSGDGN IGFFNSGTGN FGIGNTGTGN 1080 

FGIGNSGSTS TGLFNSGDGN TGGFNPGNFN TGNFNTGSFN TGGFNAGNTN TGHFNTGNYN 1140 

45 TGIANTGDVS TGAFISGNYS NGILWRGDYQ GLXGYSYALT XPEIPAHLDV NIPIDIPITG 1200 

SFTDLWDNF TIPIIGFESF AFSFHIHTEP DIGPIIVPSF VLSVPTFAXA VGGPTTAINI 1260 

SATAGLGPIT IPIIDIPAAP GIGNSTTSPS SGFFNTGAGT ASGFGNVGGN TSGLWNLASA 1320 

ASGVSGLLNV GALGSGVANV GNTISGXYNT SPLDLGTPAF GSGLANXAGLt LQGGAGTTIL 13 8 0 

DLAGLGNLW GIiANLGGSNF GIGNTGIFNV GFANVGNHNI GliANLGNYSV GFANSGNYHI 1440 

50 GIANTGSANI GFANTGSGNX GIGLTGTGQI GFGSFNSGSH NIGLFNSGDG NVGFFNSGTG 15 00 

NVGIGMTGTA NFGXANSGSF NTGLGNTGST NTGLFNPGNV NTGVGNTGSI NTGSINTGSF 1560 

NTGSTNTGSF NLGDHNTGSF NSGDYNTGYF NAGDYNTGVA NTGNVNTGAF ISGMYSNGFF 1620 

WRGDYQGLXG LSTTITIPEX PYRYDLSVPX DIPITGTWA TTPNSFTIPG FQIRVLLGPA 16 8 0 

AVLWEMIGP ITXDVNQVXA IDSPIQQTIS MVGTGGFGPI PXGISIGGTP GFGNSTTGPS 1740 

55 SGFFHTGAGH VSGFGNFGAG NMSGSGNFGA GNSGFFNAGG LGNSGLLNFG ALQSGIANLG 18 00 

NTISGVYNTS TLDLATPAFG SGXANIGANL AGLFLDNTGN LTLNFGVANQ GGLNAGIGNL 1860 

GSVNIGFVNT GDSNLGIGNL GDLNFGGVNI GGNNIGIANT GIFDIGLANL GSYNIGLANL 192 0 

GDDNLGFGNA GSYNIGFANF GSDNLGFANT GSYNXGFANT GNNNIGVGLT GMGQIGIGSL 1980 

NSGSNNXGLF NSGSGNXGFF NSGTGNVGIF NTGTGNFGDA NSGGFNTGIG NAGSTNTGVF 2040 

60 NPGDLNTGSF NPGSFNTGGF NPGSGNTGYL NTGDYNTGVA NTGDVDTGAF XTGSYSNGFL 2100 

VSGDYQGLIG LPIiLGIPVTP GYFNLTGGPS SGFFNSGAGS VSGFVNSGAG LSGYLNTGAL 2160 

GSGVANV6NT ISGWLNASAL DIiATPGFLSG XGNFGTNLAG FFRG 2204 
<212> Type : PRT 
<211> Length : 2204 

SequenceName : SEQ ID 134
SequenceDescription :

Sequence

<213> OrganismName : Mycobacterium tuberculosis H37Rv
<400> PreSequenceString : 

5 MSFVLIAPEF VTAAAGDLTN LGSSISAANA SAASATTQVL AAGADEVSAR lAALFGGFGL 60 

EYQAISAQVA AYHQRFVQAL STGAGAYASA EAAAAEQIVL GVINAPTQAL LGRPLIGDGA 12 0 

NATTPGGAGG AGGLLFGNGG AGAAGAPGQA GGPG6PAGLW GNGGPGGAGG SGGGTGGAGG 180 

AGGWLFGVGG AGGVGGAGGG TGGAGGPGGL IWGGGGAGGV GGAGGGTGGA GGRAELLFGA 240 

GGAGGAGTDG GPGATGGTGG HGGVGGDGGW LAPGGAG6AG GQGGAGGAGS DGGALGGTGG 3 00 

10 TGGTGGAGGA GGRGALLLGA GGQGGLGGAG GQGGTGGAGG DGVLGGVGGT GGKGGVGGVA 3 60 

GLGGAGGAAG QLFSAGGAAG AVGVGGTGGQ GGAGGAGAAG ADAPASTGLT GGTGFAGGAG 42 0 

GVGGQGGNAI AGGINGSGGA GGTGGQGGAG GMGGSGADNA SGIGADGGAG GTGGNAGAGG 480 

AGGAAGTGGT GGWGAAGKA GIGGTGGQGG AGGAGSAGTD ATATGATGGT GFSGGAGGAG 540 

GAGGNTGVGG TNGSGGQGGT GGAGGAGGAG GVGADNPTGI GGTGGTGGKG QAGGAGGQGG 600

15 SSGAGGTNGS GGAGGTGGQG GAGGAGGAGA DNPTGIGGAG GTGGTGGAAG AGGAGGAIGT 660 

GGTGGAVGSV GNAGIGGTGG TGGVGGAGGA GAAAAAGSSA TGGAGFAGGA GGEGGAGGNS 72 0 

GVGGTNGSGG AGGAGGKGGT GGAGGSGADN PTGAGFAGGA GGTGGAAGAG GAGGATGTGG 7 80 

TGGWGATGS AGIGGAGGRG GDGGDGASGL GLGLSGFDGG QGGQGGAGGS AGAGGINGAG" 840 

GAGGNGGDGG DGATGAAGLG DNGGVGGDGG AGGAAGNGGN AGVGLTAKAG DGGAAGNGGN 900 

20 GGAGGAGGAG DNNFNGGQGG AGGQGGQGGL GGASTTSINA NGGAGGNGGT GGKGGAGGAG 960 

TLGVGGSGGT GGDGGDAGSG GGGGFGGAAG KAGGGGNGGR GGDGGDGASG LGLGLSGFDG 1020 

GQGGQGGAGG SAGAGGINGA GGAGGNGGDG GDGATGAAGL GDKTGGVGGDG GAGGAAGNGG 10 80 

NAGVGLTAKA GDGGAAGNGG NGGAGGAGGA GDNNFNGGQG GAGGQGGQGG LGGASTTSIN 1140 

ANGGAGGNGG TGGKGGAGGA GTLGVGGSGG TGGDGGDAGS GGGGGFGGAA GKAGGGGNGG 1200 

25 VGGDGGEGAS GLGLGLSGFD GGQGGQGGAG GSAGAGGING AGGAGGTGGA GGDGAPATLI 1260 

GGPDGGDGGQ GGIGGDGGNA GFGAGVPGDG GDGGNAGFGA GVPGDGGIGG TGGAG6AGGA 13 20 

GADGDPSIDG GQGGAGGHGG QGGKGGLNST GLASAASGDG GNGGAGGAGG NGGDGDGFIG 13 80 

GSGGTGGTGG DAGVGGLANT GGTAGNAGIG GAGGRGGDGG AGDSGALSQD GNGFAGGQGG 1440 

QGGVGGKTAGA GGINGAGGTG GTGGAGGDGQ NGTTGVASEG GAGGQGGDGG QGGIGGAGGN 15 0 0 

30 AGFGAGVPGD GGIGGTGGAG GAGGAGADGD PSIDGGQGGA GGHGGQGGKG GLNSTGLASA 1560 

ASGDGGNGGA GGAGGNGGDG DGFIGGSGGT GGTGGDAGVG GLANTGGTAG NAGIGGAGGR 1620 

GGDGGAGDSG ALSQDGNGFA GGQGGQGGVG GNAGAGGING AGGTGGTGGA GGDGQNGTTG 1680 

VASEGGAGGQ GGDGGQGGIG GAGGNAGFGA GVPGDGGIGG TGGAGGAGGA GADGDPSIDG 1740 

GQGGAGGHGG QGGKGGLNST GIiASAASGDG GNGGAGGAGG NGGAGGLGGG GGTGGTNGNG 1800 

35 GLGGGGGNGG AGGAGGTPTG SGTEGTGGDG GDAGAGGNGG SATGVGNGCaT GGDGGNGGDG 1860 

GNGAPGGFGG GAGAGGLGGS GAGGGTDGDD GNGGSPGTD6 S 1901 
<212> Type : PRT 
<211> Length : 1901 

SequenceName : SEQ ID 135
SequenceDescription :

Sequence

<213> OrganismName : Mycobacterium tuberculosis H37Rv
<400> PreSequenceString :

MSLiVIVAPET VAAAALDVAR IGSSIGAANA AAAGSTTSVL AAGADEVSAA lATLFGSHAR 60 

EYQAISTQVA AFHDRFAQTL SAAVGSYVSA EATNAAPLAT LEHNVLNALN APTQALLGRP 12 0 

LIGDGAAGAP GTGQAGGAGG ILWGNGGAGG SGAPGQVGGA GGAAGLFGTG GAGGAGGAGA 180 

AGGAGGSGGW LLGNGGVGGA GGQSLLGGAT GGAGGNAGLF GVGGTGGPGG PGGPGGVGGT 240 

50 GGAGGLGGTL YGAGGHGGAG GPGPIGGVGG HGGVGGAAGL LGVGGHGGAG GHGAEGVAGA 3 00 

AGEDLSPHGT SGGVGGDAGD GGTGGRGGWL AGAGGAGGAG GVGGTGGAGG AGFSRALIVA 3 60 

GDNGGDPGAG GAGGTGGAGS TIGAHGAAGA SPTSGGNGGA GGNGAHFSSG GKAGGNGGAG 420 

GAGGLVGNGG AGGAGGNGAP GAPPSGGDPN GGGGGAGGAG GKGGDGGAQA GDGGAGGAGG 48 0 

KGGNGGNGAT GATGLNGLGA GADGTDGGKG GNGGAGGGGG AGGQGGKALA ATHQDGSMGA 540 

55 GGAGGNGGAG GMGGDGGNGA KGTFDNGGDG VGGNGGNGGS RGIGGAGGIG GAGSTAGADG 600 

ARGATPTSGG NGGTGGNGAN ATVAGGAGGA GGKGGNGGLV GNGGAGGKGG DGMAGVAGSS 660 

PTTAGESGTS GQNGGAGGAG GAGGRGGDFG GDGGTGGAGG NGANGANATT PGAKGGDGGH 72 0 

GGPGAQGGNG GQGGPGGLAG NLFGQNGIQG VGGSGGKGGA GGLAGDGGNG ANGNFAFGDG 78 0 

NGGHGGNGGN PGAGGQGGSG GAGSTPGAKG AHGFTPTSGG DGGDGGNGGN SQWGGNGGD 840 

60 GGNGGNGGSA GTGGNGGRGG DGAFGGMSAN ATNPGENGPN GNPGGNGGAG GAGGAGLNGG 90 0 

NGGAGGNGGL GGFGGNGAAG ANGVAVGAPG QPGGAGGHGG AGGNGGAGGN GGQGWSDGA 960 

GGAGGAGGDG GAPGDGANGG NGQGAGAFAG GGGGRGGDGG NAGNAGAGGP GGTGSTAGKA 102 0 

GPAGSILHDG GNGGHGGHGA ASGGNGGPGG HGGNGGNGGT GANGGNGGIG GTGGAGSTGA 1080 

KGVLGTNEGD GGDGGRGGNG GRGGNGGQGL TGAGGNGGTG GTPGNGGNGG NGASGDLVTS 1140 

65 PGDGGGGGRG GDAGRGGDAG LGGSSGPGGT PGDWGTGGTG GTGGTGGQGA NGGLTGGRGG 1200 

TGGNGGNGNT GGTGGAGGTG GTGHNGSQPG MGGNGGAGGF GGNGFAGVGG RGGMGGSGGT 1260 

GGTGDAGPFG TGTGGTGGHG GQGGGGGFSI LLGLGGLGGL GSPGSIATGT AGGAGGGGGF 1320 
GGLGGGBFV 1329
<212> Type : PRT
<211> Length : 1329

SequenceName : SEQ ID 136
SequenceDescription :

Sequence

<213> OrganismName : Mycobacterium tuberculosis H37Rv
<400> PreSequenceString :

MSYVIATPEM MATAAFDIiAR IGSQVSAASA VAAMPTTEW AAGADEVSAG lAAIiFSAHAQ 60 

EYQALSAQAA AFHDQFVHTL TAAARWYTAT EIANAAAMRV VLGAVNAPTQ TLLGRPLIGD 12 0 

GAHGTAPGQP GGAGGLLFGN" GGNGAAGAVG QVGGAGGAAG LFGIGGAGGA GGAGAPGGTG 18 0 

GTGGWI^AGGG GVGGMGGAGG .GAGGAGGHAG JjFGNGGAGGA GGAGGGAGGA GGNAGWFGHG 24 0. 

15 GAGGVGGVGA AGANGATPGQ DGAAGVAGSD DGAGGDGLAG SDGGDGGAGG VGGNGGRGGW .3 00 

LLGNGGAGGV GGVGGAGGAG AAGGAGGAGA TGINGPAGIS AAGGDGGAGG NGGAGGNGGV 3 60 

GGAGGAGGSA GLLGYVGRAG DGGAGGGGGL GGAPGDGGAG GNGGSWLAAG DGGAGGHGGD 42 0 

PGLGGAGGAG GASGGAGARA GANGLAAGND GPVSGGMGGK GGNGAHAPVA GGHGGNGGAG 48 0 

GNGGIiVGDGG AGGHGGDGAA GAGYADMTAI FLGSSGTPGE DGGNGGAGGA GGAGGAHAGD 540 

20 GGAGGAGGNG GAGGAGGNGA HGFNAVLVSD GGNGGDGGAG GRGGDGGAGG AGGDAPAGRA 600 

GSQGVGGDGG AGGAGGAPGN GGSGGRGDMA FKDGDGGAGG DGGDPGAGGK GGAGGAGATE 660 

GVTGATGATV HSGGNGGKGG NGADATVAGA NGGKGGAGGN GGLVGDGGAG GDGGSGAAGA 72 0 

NGANVGEDGA DGTLSGQPGE GSEANGGQGG VGGGGAGGAG GDGGAGSSAL GSGGNGGRGD 78 0 

AGQAGGAGGA GGAGGAGGSV SGDGGPGGKG GAGGAGGAGA SGGGGGKGAS GADSAEAVGG 84 0 

25 AGGKGGDGGV GGVGGDGGPG GDGGAGGAAP AGQVGSHGVG GVGGDGGLGG AGGNGGDGGH 900 

GSDGGDGGDG GDPGAGGLGG LGGDSGNGTR AASGVDASDH GPGSGGNGGM GGNGAQASVA 960 

GGAGGNGGDG GNAGRVGDGG AGGNGGDGAA GANGANSGAP GSDALALGQP GGNGGQGDAG 1020 

QAGGAGGAGG AGGAGGSVSG DGGAGGNGGA GGNGGVGASG GAGARGANGI DSIGGTGGAG 10 8 0 

GGGGDGGAGG VGGHGGDGGV GGAAPSGTVG SHGTGGVGGD GGLGGAGGVG GAGGNGGIGI 1140 

30 TVGGAGGAGG NGGDPGAGGR GGLGGDSGNG TSAANGVDAS KHGPLTGGDG GVGGNGAKAA 12 0 0 

AAGGDGGQGG DGGNAGLFGD GGAGGDGADG TAAEALGGDG GAGGAGGKGG DAGDIGDGGD 1260 

GGKGGDGAHG ALGGLTVAGG NGGAGGAGGA GGAGGAFLGD GGNGGAGGQG GAGRGGSPGG 13 20 

GGGVGGHGGA GGDAGMMGGG GTGGQGGNGA AGGAGWSPDS DLKGFDGFDG GSGGAGGDGG 13 80 

AGGAGGTQTG DGGDGGAGGL GGAGGVGGNG VDGFDINETT GRDGGDGGDG GYGGWGGAGG 1440 

35 NGGAGGSAPA GEVGNRGVGG DGGDGGSGGD AGNGGLGGDG FTYLADFDGE PGGDGGDGGD 1500 

GGWGRPGGQG GFGSTSGAHG KAGFGAPGGD GGDGGNGGHG GDGNGSFADA GDGGPGGNGG 1560 

NGGLGGAGRD GGAPGGDGGD GGTGGSGGFG APPPRSIGGG DGGDGGRGGD GGRGAG(^TS 1620 

GGVGSSGESG GSGNGRGDPG SGGS6GEGGE GGPSISVNVT 1660 
<212> Type : PRT 

<211> Length : 1660

SequenceName : SEQ ID 137
SequenceDescription :

Sequence

<213> OrganismName : Mycobacterium tuberculosis H37Rv
<400> PreSequenceString : 

MSFVLVSPET VAAVATDLKR IGASLAHENA SAAASTTAW SAAADEVSTA VAALFSQHAQ 60 

GYQAAAAQVA AFHSRFVQAL TAGAGAYAFA EAANASPLQS AMGAVSASAQ TLLSRPLIGN 120 

50 GANATTPGGN GGDGGWLFGS GGNGAPGAAG QSGGNGGSAG LWGNGGAGGA GGSGGAAGGN 180 

GGNGGWLFGA GGTGGIGGTG APGAMGGTGG NGGNGALLIG GGGLGGAGGM GGTGGGTGGT 240 

GGNGGNGALL IGAGGVGGAG GIGGQGTGAG GAAGAGGTGG NGGAGGLFMN GGDGGAGGQG 3 00 

GDGAAGDAAA SAGGTGGKGG QGGDGGTGGA GGAGPVLFGH GGAGGMGGQG GTGGMGGAGG 3 60 

DGTTVIAAGT GGEGGTGGAA GAGGAAGARG ALTSGGLAGG VGAGGTGGTG GTGGNGADAA 420 

55 AWGFGANGD PGFAGGKGGN GGIGGAA.VTG GVAGDGGTGG KGGTGGAGGA GNDAGSTGNP 480 

GGKGGDGGIG GAGGAGGAAG TGNGGHAGNT GDGGDGGTGG NGGNGTGGVN GADNTLNPDT 540 

PGGAGEPGGA GGAGGAGGAA GGPGGTGGTG GNGGNGGNGG NGGNGGNGGN GGNAGNNSTN 600 

APVGGEGGAG GDGGAGGAG6 AANGGTAGSQ GTGGVGGDGG AGGNGGGGKA GTGMSGNFGV 660 

DGEAGFSGGA GGNGGVGGAA GANGGTGGSG GNGGDGGAGG IGGAGGNGIP GTGTEPAGGT 720 

60 GAKGGDGGDG GAGGAGGNAG GAGGQGGNAG QGGAGGAGGN AVIPGDGVGK APHGDAGGSG 780 

GDGGKGGQGG SGGTGGSGAP IGGGAGGTGG SGGHAGKGGA GGIGAQGTTI TVPGNGGNAG 840 

DGGNGGNAGA GGNGGSGDFG GNTTSGASGS GGNGGNAGTA GSGGAGGTGG TGLSGGNGGN 900 

GGNGGNG6DG GNGAHGTVGA QFVPATSLPT PNGGAGGNGG TGSNGGAPGP AGAPGPTTGG 960 

NAGSQGIGGD GGNGGDG6KG GDGADAVNW FMPTEPQAAT GTAGSAGDPT GGNGGPGTPG 1020 

65 SPMVAPPPPT PITQVQQ6GD GGAGGTGSTN ANDGTATGGK GGEGGVGSIL GGPGGNGGTG 1080 

GNASATGTNTG VANAGNGGKG GDGGQFGAGG NGGAGGSVTD GSAGSTAGNG GNGGNATNGT 1140 

lAGQPAGGNG SAGGKGGDGG NIAAGATGTA GNGGNGGNGN DGAVNAGTGG SGGNGGWAGG 1200 
GGAMTGGDGGA GGAGGAGGRG 6K6IDGGFGG DGGNGGSNNG TGAGGKTGGNG GTGGVGSVGA 1260 
AGGDGGNGGT GGPAGFGGTA GNGGSGGTGG AGGDGGTGGD GGKTGVIAGGG GTGGNGGASG 1320 
AGQAGGTG6F AGNGNAGGNG GTGGASEDGD NGNAGSGATG GTGGNGGTGG DGGAAGLGGV 13 80 
A 1381 
<212> Type : PRT

<211> Length : 1381 

SequenceName : SEQ ID 138 

SequenceDescription :

Sequence



<213> OrganismName : Mycobacterium tuberculosis H37Rv
<400> PreSequenceString : 

MSFVIATPBM LTTAATDLAK IGSTXTAANT AAAAVAKVLP ASADEVSVAV- AALFGTIIAQE <50 

15 YQTVSAQVAT FHDRFVQTLS AAASSYVAAE AVNVEQSLLA AVNAPTQALF GRPLIGNGAD ' 120 

GSPGTGQAGG PGGILYGNGG NGGSGAPGQR GGAGGAAGLI GNGGNGGAGG VGTTGGAGGH 180 

GGAGGWLYGN GGAGGFGGAG AVGGNGGAGG TAGLFGVGGA GGAGGNGIAG VTGTSASTPG 240 

GSGTAGGAGG IGGNGGAGGA GGVLMGNGGN GGAGGEGGPG GAGGAGASGA HATNLGADGQ 3 00 

AGGNGGNGGA GGTGGVGGPG GGHGLLGLGG SHGAGGAGGS GGDGGAPGDG GNGATGTWGH 360 

20 NLGAGGTGGN GGNPGAGGAG GAGGASVGGS AHGANGAPGT TSTSGGNGGD GGKGADAISS 42 0 

GQTGANGGRG GDGGQVGNGG AGGAGGRGGA GGLGFGSEAP GRPGGAGGTG GAGGNGGTQA 48 0 

GDGGTGGAGG AGGDGGSGGA GSIGFNASAP GAAGSPGGNG GNGGPGGAGG EGGAGGLAIiA 540 

ASGQNGSQGA GGDGGAGGNG GTPGNGGHGA AGALGVNGGV GGAGGHGGDP GVGGAGGQGG 60 0 

SGSTPGANGA PGNTPTSGGN GGNGGRGADA TGFGQTGASG GRGGDGGLVG NGGAGGAGGN 660 

25 GSKGLPGLGR LGNPGLDGGT GGWGGAGGSG GAWAGNGGTG GAGGTGGVGG TGGSGSDGVN 72 0 

GSSAGADGHP GGTGGVGGTG GKGGDGGDGG AAPNGVAGSQ GPGGAGGDGG TGGVGGNGGR 780 

GIDGADGATA GARGQDGGAG GAGGKGGRGG TGGPGGAGPA GTTGSQGAGG NGGSGGTGGD 84 0 

PGDGGNGANG SVFTNNGIGG NGGNGGNAGP SGAGGSGGAG STFGATGSSS SIHVNGGNGG 90 0 

NGGNGDHALS GNGAAGGNGG NGGNGSLRGS GGAGGHGGNG GNASRGMGGD GGTGGAGGNA 960 

30 GQXGNGGAGG NGGDGGTGSD GNPGAITGSG GRGGDGGVGG QGGSVAGDGA DGGRGGAGGT 102 0 

GGTGLRGTTG ATGATGTFDA GADGHGGNGG TGGVGGTGGA GGGGGNGGAG GKALSPTGNU 1080 

GSQGAGGDGG AGQAGGTGGT GGDGGRGAHG TLFSSLAGTG GTGGNGGTGG TGGTGGAGGA 1140 

GGTGSTLGAT GATGAAGRAG NGGVGGSGGL GSAFGPGGTG GMGGAGGTST VSAGGDGGRG 1200 

GFGGDGLDAS SGGNGGDGGH GGDGFRTAGA GGRGGDGGKG ADPGGLFPIP GAGGKGGTGG 1260 

35 TGGTAHLGPL AIIGQSGQPG QFGSPGADGR GGAGGAGGGG GAGGSF' 1306 
<212> Type : PRT 
<211> Length : 1306

SequenceName : SEQ ID 139
SequenceDescription :


Sequence 



<213> OrganismName : Mycobacterium tuberculosis H37Rv
<400> PreSequenceString :

45 MSAAAVAWDQ LAMELASAAA SFNSVTSGLV GESWLGPSSA AiyiA?VAVAPYL GWLAAAAAQA 60 

QRSATQAAAL VAEFEAVRAA MVQPALVAAN RSDLVSLVFS NFFGQNAPAI AAIEAAYEQM 120 

WAIDVSVMSA YHAGASAVAS ALTPFTAPPQ NLTDLPAQLA AAPAAWTAA ITSSKGVLAN 180 

LSLGLANSGF GQMGAANLGI LNLGSLNPGG NNFGLGNVGS NNVGLGNTGN" GNIGFGNTGN 240 

GNIGFGLTGD NQQGFGGWNS GTGNIGLFNS GTGNIGIGNT GTGNFGIGNS GTSYNTGIGM 300 

50 TGQANTGFFN AGIANTGIGN TGNYNTGSFN LGSFNTGDFM TGSSNTGFFN PGNLNTGVGN 360 

TGNVNTGGFN SGNYSNGFFW RGDYQGLIGF SGTLTIPAAG LDLNGLGSVG PITIPSITIP 420 

EIGLGINSSG ALVGPINVPP ITVPAIGLGI NSTGALVGPI NIPPITLNSI GLELSAFQVI 480 

NVGSISIPAS PLAIGLFGVN PTVGSIGPGS ISIQLGTPEI PAIPPFFPGF PPDYVTVSGQ 540 

IGPITFLSGG YSLPAIPLGI DVGGGLGPFT VFPDGYSLPA IPLGIDVGGG LGPFTVFPDG 600 

55 YSLPAIPLGI DVGGGLGPFT VFPDGYSLPA IPLGIDVGGA IGPLTTPPIT IPSIPLGIDV 660 

SGSLGPINIP lEIAGTPGFG NSTTTPSSGF FNSGTGGTSG FGNVGSGGSG FWNIAGNLGN 72 0 

SGFLNVGPLT SGILNFGNTV SGLYNTSTLG LATSAFHSGV GNTDSQLAGF MRNAAGGTLF 780 

NFGFANDGTL NLGNANLGDY NVGSGNVGSY NFGSGNIGNG SFGFGNIGSN NFGFGNVGSM 840 

NLGFANTGPG LTEALHNIGF GNIGGNNYGF ANIGNGNIGF GNTGTGNIGI GLTGDNQVGF 900 

60 GALNSGSGNI GFFNSGNOTI GFFNSGNGNV GIGNSGNYNT GLGNVGNANT GLFMTGNVNT 960 

GIGNAGSYNT GSYNAGDTNT GDLNPGNANT GYLNLGDLNT GWGNIGDLNT GALISGSYSN 102 0 

GILWRGDYQG LIGYSDTLSI PAIPLSVEVN GGIGPIWPD ITIPGIPLSL NALGGVGPIV 1080 

VPDITIPGIP LSLNALGGVG PIWPDITIP GIPLSLNALG GVGPIWPDI TIPGIPLSLN 1140 

ALGGVGPIW PDITIPGIPL SLNALGGVGP ITVPGVPISR IPLTINIRIP VNITLNELPF 1200 

65 NVAGIFTGYI GPIPLSTFVL GVTLAGGTLE SGIQGFSVNP FGLNIPLSGA TNAVTIPGFA 1260 

INPFGIiNVPL SG6TSPVTIP GFAINPFGLN VPLSGGTSPV TIPGFTIPGS PLNLTANGGL 1320 

GPINIPINIT SAPGFGNSTT TPSSGFFNSG DGSASGFGNV GPGISGLWNQ VPNALQGGVS 1380 
GIYWVGQIiAS GVANIiGNTVS GFNNTSTVGH LTAAFNSGVN KTIGQMLLGFP SPGAGP 1436 

<212> Type : PRT 
<211> Length : 1436 
SequenceName : SEQ ID 140

SequenceDescription : 

Sequence 



<213> OrganismName : Mycobacterium tuberculosis H37Rv
<400> PreSequenceString :

MEFPVLPPEI NSVLMYSGAG SSPLIiAAAAA WDGLAEELGS AAVSFGQVTS GLTAGVWQGA 60 

AAAAMAAAAA PYAGWLGSVA AAAEAVAGQA RWVGVFEAA IiAATVDPALV AANRARLVAL 120 

AVSNLLGQUT PAIAAAEAEY ELWAADVAA FJVGYHSGASA AAAALPAFSP PAQALGGGVG 180 

15 AFLTAIiFASP AKALSLNAGL GNVGITYNVGIj GNVGVFNLGA GNVGGQNLGF GMAGGTNVGF 240, 

GNLGNGNVGF GNSGLGAGLA GLGNIGLGNA GSSNYGFANL GVGNIGFGNT GTNNVGVGLT 3 00 

GNHLTGIGGL NSGTGNIGLF NSGTGNVGFF NSGTGMFGVF NSGNYNTGVG NAGTASTGLF 360 

NAGNFNTGW NVGSYNTGSF NAGDTNTGGF NPGGVNTGWL NTGNTNTGIA NSGNVNTGAF 420 

ISGNFNNGVL WVGDYQGLFG VSAGSSIPAI PIGLVLNGDI GPITIQPIPI LPTIPLSIHQ 480 

20 TVNLGPLWP DIVIPAFGGG IGIPINIGPL TITPITLFAQ QTFVNQLPFP TFSLGKITIP 540 

QIQTFDSNGQ LVSFIGPIVI DTTIPGPTNP QIDLTIRWDT PPITLFPNGI SAPDNPLGLL 600 

VSVSISNPGF TIPGFSVPAQ PLPLSIDIEG QIDGFSTPPI TIDRIPLTVG GGVTIGPITI 660 

QGLHIPAAPG VGNTTTAPSS GFFNSGAGGV SGFGNVGAGS SGWWNQAPSA BLGAGSGVGN 720 

VGTLGSGVLN LGSGISGFYN TSVLPFGTPA AVSGIGNLGQ QLSGVSAAGT TLRSMLAGNL ^ 780 

25 GLAISfVGNFNT GFGNVGDVNL GAANIGGHNL GLGNVGDGNL GLGNIGHGNL GFANLGLTAG 840 

AAGVGNVGFG NAGINNYGIiA NMGVGNIGFA NTGTGNIGIG LVGDHRTGIG GLNSGIGMIG 90 0 

LFNSGTGNVG FFNSGTGNFG IGMSGRFNTG IGNSGTASTG LFNAGSFSTG lANTGDYNTG 960 

SFNAGDTNTG GFNPGGINTG WFNTGHANTG IiAtTAGTFGTG AFMTGDYSNG LLWRGGYEGL 102 0 

VGVRVGPTIS QFPVTVHAIG GVGPLHVAPV PVPAVHVEIT DATVGLGPFT VPPISIPSLP 1080 

30 lASITGSVDL AANTISPIRA LDPLAGSIGL FLEPFRLSDP FITIDAFQW AGVLFLENII 1140 

VPGLTVSGQI LVTPTPIPLT LNLDTTPWTL FPNGFTIPAQ TPVTVGMEVA NDGFTFFPGG 120 0 

liTFPRASAGV TGLSVGLDAF TLIiPDGFTLD TVPATFDGTI LIGDIPIPII DVPAVPGFGN 1260 

TTTAPSSGFF NTGGGGGSGF ANVGAGTSGW WNQGHDVLAG AGSGVANAGT LSSGVLNVGS 1320 

GXSGWYNTST LGAGTPAWS GIGNLGQQLS GFLANGTVLN RSPIVNIGWA DVGAFNTGLG 13 80 

35 NVGDLNWGAA NIGAQNIiGLG NLGSGNVGFG NIGAGNVGFA NSGPAVGLAG LGNVGLSMAG 1440 

SNNWGLANLG VGNIGIANTG TGNIGIGLVG DYQTGIGGLN SGSGNIGLFN SGTGNVGFFN 15 00 

TGTGNFGLFN SGSFNTGIGN SGTGSTGLFN AGNFNTGIAN PGSYNTGSFM VGDTNTGGFN 1560 

PGDINTGWFN TGIMNTGTRN TGALMSGTDS NGMLWRGDHE GLFGLSYGIT IPQFPIRITT 1620 

TGGIGPIVIP DTTILPPIiHL QITGDADYSF TVPDIPIPAI HIGINGWTV GFTAPEATLL 1680 

40 SAIiKNNGSFX SFGPITLSMI DIPPMDFTLG liPVLGPITGQ LGPIHLEPIV VAGIGVPLEI 1740 

EPIPLDAISL SESIPIRIPV DIPASVIDGI SMSEWPIDA SVDIPAVTIT GTTISAIPLG 18 00 

FDIRTSAGPIi NIPIIDXPAA PGFGITSTQMP SSGFFNTGAG GGSGIGNLGA GVSGLLNQAG 1860 

AGSLVGTLSG LGWAGTLASG VLNSGTAISG LFNVSTLDAT TPAVISGFSN LGDHMSGVSI 1920 

DGLIAILTFP PAESVFDQII DAAIAELQHL DIGNALALGN VGGWLGLAN VGEFNLGAGN 1980 

45 VGNINVGAGN LGGSNLGLGN VGTGNLGFGN IGAGNFGFGN AGLTAGAGGL GNVGLGNAGS 2040 

GSWGIiANVGV GNIGLANTGT GNIGIGLTGD YRTGIGGLNS GTGNLGLFNS GTGNIGFFNT 210 0 

GTGNFGLFMS GSYSTGVGNA GTASTGLFNA GNFNTGLAWA GSYNTGSLNV GSFNTGGVNP 2160 

GTVNTGWFNT GHTNTGLFNT GNVNTGAFNS GSFNNGALWT GDYHGLVGFS FSIDIAGSTL 2220 

LDLNETLNLG PIHIEQIDIP GMSLFDVHEI VEIGPFTIPQ VDVPAIPLEI HESIHMDPIV 22 80 

50 liVPATTIPAQ TRTIPLDIPA SPGSTMTLPL ISMRFEGEDW ILGSTAAIPN FGDPFPAPTQ 23 40 

GITIHTGPGP GTTGELKISI PGFEIPQIAT TRFLLDWIS GGLPAFTLFA GGLTIPTNAI 24 0 0 

PLTIDASGAL DPITIFPGGY TIDPLPLHIiA LNLTVPDSSI PIIDVPPTPG FGNTTATPSS 2460 

GFFMSGAGGV SGFGNVGSNL SGWWNQAASA LAGSGSGVLN VGTLGSGVLN VGSGVSGIYN 252 0 

TSVLPLGTPA VLSGLGNVGH QLSGVSAAGT ALNQIPILNI GLADVGNFNV GFGNVGDVNL 25 80 

55 GAANLGAQNL GLGNVGTGNL GFANVGHGNI GFGNSGLTAG AAGLGNTGFG NAGSANYGFA 2 64 0 

NQGVRNIGLA NTGTGNIGIG LVGDNLTGIG GLNSGAGNIG LFNSGTGNIG FFNSGTGNFG 2700 

IGNSGSFNTG IGNSGTGSTG LFNAGSFNTG VANAGSYNTG SFNAGDTNTG GFNPGTINTG 2760 

WFNTGHTNTG lANSGNVGTG AFMSGNFSNG LLWRGDHEGL FSLFYSLDVP RITIVDAHLD 2820 

GGFGPWLPP IPVPAVNAHL TGNVAMGAFT IPQIDIPALT PNITGSAAFR IWGSVRIPP 2880 

60 VSVIVEQIIN ASVGAEMRID PFEMWTQGTN GLGITFYSFG SADGSPYATG PLVFGAGTSD 2940 

GSHLTISASS GAFTTPQLET GPITLGFQVP GSVNAITLFP GGLTFPATSL LNLDVTAGAG 30 00 

GVDIPAITWP EIAASADGSV YVLASSIPLI NIPPTPGIGN STITPSSGFF NAGAGGGSGF 3 060 

GNFGAGTSGW WNQAHTALAG AGSGFANVGT LHSGVLNLGS GVSGIYNTST LGVGTPALVS 3120 

GLGNVGHQLS GLLSGGSAVN PVTVLNIGLA NVGSHNAGFG NVGEVNLGAA NLGAHNLGFG 3180 

65 NIGAGNLGFG NIGHGNVGVG NSGLTAGVPG LGNVGLGNAG GNNWGLANVG VGNIGLANTG 3240 

TGNIGIGLTG DYQTGIGGLN SGAGNLGLFN SGAGNVGFFN TGTGNFGLFN SGSFNTGVGN 3300 

SGTGSTGLFN AGSFNTGVAN AGSYNTGSFN VGDTNTGGFN PGSINTGWLN AGNANTGVAN 3360 
AGNVNTGAFV TGNFSNGILW RGDYQGIiAGF AVGYTLPLFP AVGADVSGGI GPITVLPPIH 3420 

IPPIPVGFAA VGGIGPIAIP DISVPSIHLG LDPAVHVGSI TVNPITVRTP PVLVSYSQGA 3480 

VTSTSGPTSE IWVKPSFFPG IRIAPSSGGG ATSTQGAYFV GPISIPSGTV TFPGFTIPLD 3540 

PIDIGLPVSL TIPGFTIPGG TLIPTLPLGL ALSNGIPPVD IPAIVBDRIL LDLHADTTIG 3600 

5 PINVPIAGFG GAPGFGNSTT LPSSGFFNTG AGGGSGFSNT GAGMSGLLNA MSDPLLGSAS 3 660 

GFANFGTQLS GILNRGAGIS GVYNTGALGV VTAAWSGFG NVGQQLSGLL FTGVGP 3716 

<212> Type : PRT 
<211> Length : 3716 
SequenceName : SEQ ID 141

SequenceDescription :

Sequence 

<213> OrganismName : Mycobacterium tuberculosis H37Rv
<400> PreSequenceString : 

MNFPVLPPEI NSVLMYSGAG SSPLLAAAAA WDGLAEELGS AAVSFGQVTS GLTAGVWQGA 60 

AAAAMAAAAA PYAGWLGSVA AQAVAVAGQA RAAVAAFEAA LAATVDPAAV AVNRMAMRAL 120 

AMSNIiLGQNA AAIAAVEAEY ELMWAADVAA MAGYHSGASA AAAALPAFSP PAQALGGGVG 180 

20 AFLNALFAGP AKMLRLNAGL GNVGNYNVGL GNVGIFNLGA ANVGAQNLGA ANAGSGMFGF 24 0 

GNIGNANFGF GNSGLGLPPG MGNIGIiGNAG SSNYGIiANLG VGNIGFANTG SNNIGIGLTG 3 00 

DNLTGIGGLN SGTGNLGLFN SGTGNIGFFN SGTGNFGVFN SGSYNTGVGN AGTASTGLFN 3 60 

VGGFNTGVAN VGS YNTGS FN AGNTNTGGFN PGEVNTGWLN TGNTNTGIAN SGNVNTGAFI 420 

SGNFSMGVLW RGDYEGLWGL SGGSTIPAIP IGLELNGGVG PITVLPIQIL PTIPLNIHQT 480 

25 FSLGPLWPD IVIPAFGGGT AIPISVGPIT ISPITLFPAQ NFNTTFPVGP FFGLGWNIS 540 

GIEIKDLAGKT VTLQLGNLNI DTRINQSFPV TVNWSTPAVT IFPNGISIPN NPLALLASAS 600 

IGTLGFTIPG FTIPAAPLPL TIDIDGQIDG FSTPPITIDR IPLNLGASVT VGPILINGVN 660 

IPATPGFGNT TTAPSSGFFN SGDGGVSGFG NFGAGSSGWW NQAQTEVAGA GSGFANFGSL 720 

GSGVLNFGSG VSGLYNTGGL PPGTPAWSG IGNVGEQLSG LSSAGTALNQ SLIINLGLAD 780 

30 VGSVNVGFGN VGDFNLGAAN IGDBNVGLGN VGGGNVGFGN IGDANFGLGN AGLAAGLAGV 840 

GNIGLGNAGS GNVGFGNMGV GNIGFGNTGT NNLGIGLTGD NQTGIGGLNS GAGNIGLFNS 900 

GTGNVGIiFNS GTGNFGLFNS GSFNTGIGNG GTGSTGLFMA GNFNTGVANP GSYNTGSFNV 960 

GDTNTGGFNP GS INTGWFNT GNANTGVANS GNVDTGALMS GNFSNGILWR GNFEGLFGLN 102 0 

VGXTIPEFPI HWTSTGGIGP IirPDTTIIiP PIHLGLTGQA NYGFAVPDIP XPAIHIDFDG 1080 

35 AADAGFTAPA TTLLSALGXT GQFRFGPITV SNVQLNPFNV NLKLQFLHDA FPNEFPDPTI 1140 

SVQIQVAIPL TSATLGGLAL PLQQTIDAIE IiPAISFSQSI PIDIPPIDIP ASTINGISMS 12 0 0 

EWPIDVSVD IPAVTITGTR IDPIPIiNFDV LSSAGPINXS IIDXPALPGF GNSTELPSSG 1260 

FFNTGGGGGS GIANFGAGVS GLLNQASSPM VGTLSGLGNA GSLASGVLNS GVDISGMFNV 1320 

STLGSAPAVX SGFGNLGNHV SGVSIDGLIJV MLTSGGSGGS GQPSIXDAAI AELRHLNPLN 13 80 

40 XVNLGNVGSY NLGFAWGDV NLGAGNLGNL UTLGGGNLGGQ NLGLGNLGDG NVGFGNLGHG 144 0 

NVGFGNSGLG ALPGXGNXGL GNAGSNNVGF GlsTMGLGNIGF GNTGTNNIiGX GLTGDNQTGF 1500 

GGLNSGAGNL GLFNSGTGNI GFFNTGTGNW GLFMSGSYNT GIGNSGTGST GLFNAGSFNT 1560 

GLANAGSYNT GSLNAGNTNT GGFNPGNVNT GWFNAGHTNT GGFNTGNVNT GAFNSGSFNN 1620 

GALWTGDHHG LVGFSYSIEI TGSTLVDXNE TLNLGPVHXD QIDXPGMSLP DXHELVNXGP 168 0 

45 FRIEPXDVPA WLDXHETMV XPPXVFLPSM TIGGQTYTXP LDTPPAPAPP PFRLPLLFVN 1740 

ALGDNWIVGA SNSTGMSGGF VTAPTQGILI HTGPSSATTG SLAIiTLPTVT IPTITTSPIP 180 0 

LKIDVSGGLP AFTLFPGGLN IPQNAIPLTX DASGVLDPIT IFPGGFTXDP LPLSIiALNXS 18 60 

VPDSSVPIII VPPTPGFGMA TATPSSGFFN SGAGGVSGFG NFGAGSSGWW NQAHAAIiAGA 192 0 

GSGVLNVGTL NSGVLNVGSG XSGLYNTAIV GLGTPALVSG AGNVGQQLSG VliAAGTALTQ 198 0 

50 SPIXNLGIiAD VGNYNLGLGN VGDFNLGAAN LGDLNLGLGN IGNANVGFGN IGHGNVGFGN 2040 

SGLGAALGIG NIGLGNAGST NVGLANMGVG NXGFANTGTN NLGXGLTGDN QTGIGGLNSG 2100 

AGNXGLFNSG TGNIGFFNSG TGNWGLFNSG SFNTGIGNSG TGSTGLFNAG GFTTGLANAG 2160 

SYNTGSFNVG DTNTGGFNPG SINTGWFNTG NANTGIANSG NVDTGALMSG NFSNGILWRG 2220 

NYEGLFSYSY SLDVPRXTIL DAHFTGAFGP VWPPIPVLA XNAHLTGNAA MGAFTIPQID 22 80 

55 IPALNPNVTG SVGFGPIAVP SVTIPALTAA RAVLDMAASV GATSEIEPFI VWTSSGAXGP 2340 

TWYSVGRIYN AGDLFVGGNX ISGIPTLSTT GPVHAVFNAA SQAFNTPALN IHQIPLGFQV 24 00 

PGSIDAITLF PGGLTFPANS LLNLDVFVGT PGATXPAXTF PEXPANADGE LYVIAGDIPL 2460 

INIPPTPGIG NTTTVPSSGF FNTGAGGGSG FGNFGANMSG WWNQAHTALA GAGSGIANVG 2520 

TLHSGVLNLG SGLSGXYNTS TLPLGTPALV SGLGNVGDHL SGLLASNVGQ NPXTIVNIGL 2580 

60 ANVGNGNVGL GNIGNLNLGA ANIGDVNLGF GNIGDVNLGF GNIGGGNVGF GNIGDANFGF 2640 

GNSGIiAAGLA GMGNIGLGNA GSGNVGWANM GLGNIGFGNT GTNNLGXGLT GDNQSGIGGL 2700 

NSGTGNIGLF NSGTGNIGFF NSGTANFGLF NSGSYNTGIG NSGVASTGLV NAGGFNTGVA 2760 

NAGSYNTGSF NAGDTNTGGF NPGSTNTGWF NTGNANTGVA NAGNVNTGAL ITGNFSNGIL 2820 

WRGNYEGLAG FSFGYPIPLF PAVGADVTGD IGPATIIPPI HIPSXPLGFA ' AIGHXGPISI 2880 

65 PNIAIPSIHL GIDPTFDVGP ITVDPXTLTI PGLSLDAAVS EXRMTSGSSS GFKVRPSFSF 2940 

FAVGPDGMPG GEVSILQPFT VAPINLNPTT LHFPGFTIPT GPIHIGLPLS LTIPGFTIPG 3000 

GTLIPQLPLG LGLSGGTPPF DLPTWIDRI PVELHASTTI GPVSLPIFGF GGAPGFGNDT 3060 
TAPSSGFFNT GGGGGSGFSM SGSGMSGVLN AISDPLLGSA SGFANFGTQL SGIIiNRGAGI 3120 
SGVYNTGTLG LVTSAFVSGF MNVGQQLSGIi LFAGTGP 3157 
<212> Type : PRT 
<211> Length : 3157 
SequenceName : SEQ ID 142

SequenceDescription :

Sequence 



<213> OrganismName : Mycobacterium tuberculosis H37Rv
<400> PreSequenceString : 

MSFWMPPEI NSLIiIYTGAG PGPLIiAAAAA WDELAAELGS AAAAFGSVTS GLVGGIWQGP 60 

SSVAMAAAAA PYAGWLSAAA ASAESAAGQA RAWGVFEAA LAETVDPFVI AANRSRLVSL 12 0 

_ ALSNLFGQNT PAIAAAEFDY ELMWAQDVAA MLGYHTGASA AAEALAPFGS PLASLAAAAE . 180 

15 PAKSLAVNLG LANVGLFNAG SGNVGSYNVG AGNVGSYNVG GGNIGGNNVG LGNVGWGNFG 240 

LGNSGLTPGL MGLGNIGFGN AGSYNFGLAN MGVGNIGFAN TGSGNFGIGL TGDNLTGFGG 3 00 

FNTGSGNVGL FNSGTGNVGF FNSGTGNWGV FNSGSYNTGI GKTSGIASTGL FNAGGFNTGV 3 60 

VNAGSYNTGS FNAGEANTGG FNPGSVNTGW LNTGDINTGV ANSGDVNTGA FISGNYSNGV 42 0 

LWRGDYQGLL GFSSGANVIiP VIPLSLDING GVGAITIEPI HILPDIPINI NETLYLGPLV 48 0 

20 VPPINVPAIS LGVGIPNISI GPIKINPITL WPAQNFNQTI TLAWPVSSIT IPQIQQVALS 540 
*-^PSPIPTTIiIG PIHINTGFSI PVTFSYSTPA LTLFPVGLSI PTGGPLTLTIi GVTAGTEAFT 60 0 

IPGFSIPEQP LPIiAINVIGH INALSTPAIT IDNIPLNIiHA IGGVGPVDIV GGNVPASPGF 660 
GNSTTAPSSG FFNTGAGGVS GFGNVGAHTS GWFNQSTQAM QVLPGTVSGY FNSGTLMSGI 72 0 

GNVGTQLSGM LSGGALGGNN FGLGNIGFDN VGFGNAGSSN FGLANMGIGN IGLANTGNGN 78 0 

25 IGIGLSGDNL TGFGGFNSGS ENVGLFNSGT GNVGFFNSGT GNLGVFNSGS HNTGFFLTGN 840 
NINVXiAPFTP GTLFTISEIP IDLQVIGGIG PIHVQPIDIP AFDIQITGGF IGIREFTLPE 90 0 

ITIPAIPIHV TGTVGLEGFH VNPAFVLFGQ TAMAEITADP WLPDPFITI DHYGPPLGPP 960 

GAKFPSGSFY LSISDLQING PIIGSYGGPG TIPGPFGATF NLSTSSI^ALF PAGLTVPDQT 1020 

PVTVNLTGGL DSITLFPGGLi AFPENPWSL TNFSVGTGGF TVFPQGFTVD RIPVDLHTTL 10 80 

30 SXGPFPFRWD YIPPTPANGP IPAVPGGFGL TSGLFPFHFT LNGGIGPISI PTTTWDALN 1140 

PLIiTVTGNIiE VGPFTVPDIP IPAINFGLDG NVNVSFNAPA TTLLSGLGIT GSIDISGIQI 1200 

•miQTQPAQL FMSVGQTLFIi FDFRDGIELN PIVIPGSSIP ITMAGLSIPL PTVSESIPLN 1260 

FSFGSPASW KSMILHEILP IDVSINLEDA VFIPATVnPA IPLNVDVTIP VGPINIPIIT 132 0 

EPGSGNSTTT TSDPFSGLAV PGLCVGLLGL FDGSIAlJNIiI SGFNSAVGIV GPKTVGLSNLG 138 0 

35 GGtiJVGLGNVG DFNLGAGNVG GFNVGGGNIG GNNVGLGNVG FGNVGLAMSG LTPGLMGLGN 144 0 

IGFGNAGSYN FGLANMGVGN IGFANTGSGN FGIGLTGDNL TGFGGFNTGS GNVGLFNSGT 15 0 0 

GNVGFFNSGT GNWGVFNSGS YNTGIGNSGI ASTGLFNAGG FNTGWNAGS YNTGSFNAGQ 1560 

ANTGGFNPGS VNTGWLNTGD INTGVANSGD VNTGAFISGN YSNGAFWRGD YQGLiLGFSYR 162 0 

PAVLPQTPFIi DLTLTGGLGS WXPAIDIPA IRPEFSANVA IDSFTVPSIP IPQIDLAATT 168 0 

40 VSVGLGPXTV PHLDXPRVPV TLNYLFGSQP GGPLKIGPXT GLFNTPXGLT PLALSQXVXG 174 0 

ASSSQGTXTA FLANLPFSTP WTXDEXPLL ASITGHSEPV DXFPGGLTIP AMNPLSINLS 18 0 0 

GGTGAVTIPA XTIGEXPFDI, VAHSTLGPVH XLIDLPAVPG FGNTTGAPSS GFFNSGAGGV 18 60 

SGFGNVGAMV SGGWNQAPSA LLGGGSGVFN AGTLHSGVLN FGSGMSGLFN TSVLGLGAPA 192 0 

LVSGLGSVGQ QLSGLLASGT ALHQGLVLNF GLADVGLGNV GLGNVGDFNIi GAGNVGGFNV 1980 

45 GGGNXGGNNV GLGNVGWGNF GLGNSGLTPG LMGLGNXGFG . NAGSYNFGLA NMGVGNIGFA 204 0 

NTGSGNFGXG LTGDNLTGFG GFNTGSGNVG LFNSGTGNVG FFNSGTGNWG VFNSGSYNTG 210 0 

IGNSGIASTG LFNAGGFNTG WNAGSYNTG SFNAGQANTG GFNPGSVNTG WLNTGDINTG 2160 

VANSGDVNTG AFXSGNYSNG AFWRGDYQGIi LGFSYTSTIX PEFTVANIHA SGGAGPIXVP 222 0 

SXQFPAIPLD LSATGHXGGF TXPPVSXSPX TVRIDPVFDL GPITVQDITX PALGLDPATG 22 8 0 

50 VTVGPIFSSG SXXDPFSLTL LGFXNVNVPA XQTAPSEXLP FTVLLSSLGV THLTPEITIP 2340 

GFHIPVDPXH VELPLSVTXG PFVSPEXTXP QLPLGLALSG ATPAFAFPLE XTXDRXPWL 2400 

DVNAIiLGPXN AGLVXPPVPG FGNTTAVPSS GFFNXGGGGG LSGFHNLGAG MSGVLNAISD 2460 

PLIiGSASGPA NFGTQLSGIL NR6ADISGVY NTGALGLITS ALVSGFGNVG QQIiAGLIYTG 252 0 

TGP 2523 

<212> Type : PRT

<211> Length : 2523 

SequenceName : SEQ ID 143 
SequenceDescription :

Sequence



<213> OrganismName : Mycobacterium tuberculosis H37Rv
<400> PreSequenceString :

MSFVXAVPEA LTMAASDLAN IGSTINAANA AAALPTTGW AAAADEVSAA VAALFGSYAQ 60 

65 SYQAFGAQLS AFHAQFVQSL TNGARSYWA EATSAAPLQD LLGWNAPAQ ALLGRPLXGN 120 

GANGADGTGA PGGPGGLLLG NGGNGGSGAP GQPGGAGGDA GLIGNGGTGG KGGDGIiVGSG 180 

AAGGVGGRGG WLLGNGGTGG AGGAAGATLV GGTGGVGGAT GLIGSGGFGG AGGAAAGVGT 240 
TGGVGGSGGV GGVFGNGGFG GAGGLGAAGG VGGAASYFGT GGGGGVGGDG APGGDGGAGP 300

LLIGNGGVGG LGGAGAAGGN GGAGGMLLGD GGAGGQGGPA VAGVLGGMPG AGGNGGMANW 360

F6SGGAGGQG GTGLAGTNGV NPGSIANPNT 6AN6TDNSGN GNQTGGNGGP GPAGGVGEAG 420

GVGGQGGLGE SliDGNDGTGG KGGAGGTAGT DGGAGGAGGA GGIGETDGSA GGVATGGEGG 480

DGATGGVDGG VGGAGGKGGQ GHNTGVGDAF GGDGGIGGDG NGALGAAGGN GGTGGAGGNG 540

GRGGMLIGNG GAGGAGGTGG TGGGGAAGFA GGVGGAGGEG LTDGAGTAEG GTGGLGGLGG 600

VGGTGGMGGS GGVGGNGGAA GSIiIGLGGGG GAGGVGGTGG IGGIGGAGGM GGAGGAGTTT 660

GGGATIGGGG GTGGVG6AGG TGGTGGAG6T TGGSGGAGGL IGWAGAAGGT GAGGTGGQGG 720

LGGQGGNGGN GGTGATGGQG GDFALGGNGG AGGAGGSPGG SSGIQGNMGP P6TQGADG 778
<212> Type : PRT 
<211> Length : 778 

SequenceName : SEQ ID 144 
SequenceDescription :

Sequence 



<213> OrganismName : Mycobacterium tuberculosis H37Rv
<400> PreSequenceString : 

PQGADGNAGN GGDGGVGGNG GNGADNTTTA AAGTTGGAGG AGGAGGTGGT GGAAGTGTGG 60 

QQGNGGNGGN GGTGGKGGTG GDGAIiAGSSG GAGGKGGNGG DAGKAGTGSA PGTAGTGGDG 12 0 

GKGGNGGIGA AGTTGPVGTG ASGGTGGSGG AGGTGGDGGA ANGGTAGAGG AGGNGGKGGD 18 0 

GGAGVTSSTA GNSGGAGGSG GKGGDAGAGG AGATPGANGI AGNGGDGGDG AAGAVGISGA 24 0 

TGAGDGGHGG TGAAGGNGGT GGAGGSGIDG VGGGTGGTGG NGGNGAIGGA GGDAGGSGNS 3 00 

GGNGGIGGKG GNAGAGGAAG SNGGTVGANG TGGDGGNGGA AGAATAGSNG GAGTGSAGGN 3 60 

GGTGGRGGSG GAGGDGIGGV GGGKGGNGAD GEVGGAGGAG GSGPNTSPGG NGGQGGQGGS 42 0 

GGAGGAAGAG GAGGGANGTA GNGGQGGAGG TGGAGAASSA TNGGSGGAGG TGGDGGSGGA 480 

GGTGGAGGTG GAAGDGGQGG QGGAGGGAGG QGGAGGAGGT GGNGGNITGG TAGTAGAAGN , 54 0 

GGAAGKGGAG GQGGTGGGTG GQGGAGGDGG AGGTGGDRTV GGGTVPAGSG GQGGNAGGGG 600 

AGGQGGADGG SGGDGGDAGT GGNGGNGGNR NSGNGTGGAG GNGGGGANGG AGGAGGSGGG 660 

TGGNGGAGGD AGDAGNGGNG NGTGNGGNGG NGGIAGMGGN GGAGTGSGNG GNGGSGGNGG 720 

NAGMGGNSGT GSGDGGAGGN GGAAGTGGTG GDGGLTGTGG TGGSGGTGGD GGNGGNGADN 78 0 

TAmTAQAGG DGGNGGDGGF GGGAGAGGGG LTAGANGTGG QGGAGGDGGN GAIGGHGPLT 84 0 

DDPGGNGGTG GNGGTGGTGG AGIGSLGGGT GGDGGWGGNG GTGGEGGEVG GAGGTGGAAG 90 0 

HGGDGGTGGT GGGDGGAGGT GGTGGTGGLG DPRVGGSGGD GGTGGSGGAA GNGGNGGNAG 9 60 

AGGNGNGGTG GAGGIGGTGG NGGDAEPGVP PGAGGAGGAG TTGGKGGTGG NGSGTGSG6T 1020 

GGDGGTGGGG GNGGTGWNGG KGDTGSGGGA GDGGKAPAGG TGGAGGDGGA GGKGGSGGV 1079 

<212> Type : PRT 
<211> Length : 1079 

SequenceName : SEQ ID 145 

SequenceDescription : 

Sequence 



<213> OrganismName : Mycobacterium tuberculosis H37Rv
<400> PreSequenceString :

MVMSLMVAPE LVAAAAADLT GIGQAISAAN AAAAGPTTQV LAAAGDEVSA AIAALFGTHA 60 

QEYQALSARV ATFHEQFVRS LTAAGSAYAT AEAANASPLQ ALEQQVLGAI NAPTQLWLGR 12 0 

PLIGDGVHGA PGTGQPGGAG GLLWGNGGNG GSGAAGQVGG PGGAAGLFGN GGSGGSGGAG 180 

AAGGVGGSGG WLNGNGGAGG AGGTGANGGA GGNAWLFGAG GSGGAGTNGG VGGSGGFVYG 240 

NGGAGGIGGI GGIGGNGGDA GLFGNGGAGG AGAAGLPGAA GLNGGDGSDG GNGGTGGNGG 3 00 

RGGLLVGNGG AGGAGGVGGD GGKGGAGDPS FAVNNGAGGN GGHGGNPGVG GAGGAGGLLA 3 60 

GAHGAAGATP TSGGNGGDGG IGATANSPLQ AGGAGGWGGH GGLVGNGGTG GAGGAGHAGS 420 

TGATGTALQP TGGNGTNGGA GGHGGNGGNG GAQHGDGGVG GKGGAGGSGG AGGNGFDAAT 480 

LGSPGADGGM GGNGGKGGDG GKAGDGGAGA AGDVTLAWQ GAGGDGGNGG EVGVGGKGGA 540 

GGVSANPALN GSAGAUGTAP TSGGNGGNGG AGATPTVAGE NGGAGGNGGH GGSVGNGGAG 600 

GAGGNGVAGT GLALNGGNGG NGGIGGNGGS AAGTGGDGGK GGNGGAGANG QDFSASANGA 660 

NGGQGGNGGN GGIGGKGGDA FATFAKAGNG GAGGNGGNVG VAGQGGAGGK GAIPAMKGAT 720 

GADGTAPTSG GDGGNGGNGA SPTVAGGNGG DGGKGGSGGN VGNGGNGGAG GNGAAGQAGT 780 

PGPTSGDSGT SGTDGGAGGN GGAGGAGGTL AGHGGNGGKG GNGGQGGIGG AGERGADGAG 84 0 

PNANGANGEN GGSGGNGGDG GAGGNGGAGG KAQAAGYTDG ATGTGGDGGN GGDGGKAGDG 90 0 

6AGENGLNSG AMLPGGGTVG NPGTGGNGGN GGNAGVG6TG GKAGTGSLTG LDGTDGITPN 960 

GGNGGNGGNG GKGGTAGNGS GAAGGNGGNG GSGUSTGGDAG NGGNGGGALN QAGFFGTGGK 1020 

GGNGGNGGAG MINGGLGGFG GA6GGGAVDV AATTGGAGGN GGAGGFASTG LGGPGGAGGP 1080 

GGAGDPASGV GGVGGAGGDG GAGGVGGPGG QGGIGGEGRT GC^TGGSGGDG GGGISLGGNG 1140 

GLGGNGGVSE TGFGGAGGNG GYGGPGGPEG NGGLGGNGGA GOTGGVSTTG GDGGAGGKGG 1200 
N"GGDGGNVGL GGDAGSGGAG GNGGIGTDAG GAGGAGGAGG NGGSSKSTTT GNAGSGGAGG 1260 

N-GGTGLNGAG GAGGAGGNA6 VAGVSFGNAV GGDGGNGGNG GHGGDGTTGG AGGKGGMGSS 1320 
GAASGSGWN VTAGHGGNGG NGGNGGNGSA GAGGQGGAGG SAGNGGHGGG ATGGDGGNGG 1380 
ISrGGITSGNSTG VAGIjAGGAAG AGGNGGGTSS AAGHG6SG6S GGSGTTGGAG AAGGNGGAGA 1440 
5 GGGSLSTGQS GGPRRQRWCR WQRRRWLGRQ RRRRWCRWQR RCRRQRWRWR CRQRRLRRQW 1500 
RQGRRRCRPW LHRRRGRQGR RWRQRRFQQR QRSRWQRR 1538 
<212> Type : PRT 
<211> Length : 1538

SequenceName : SEQ ID 146 
SequenceDescription :

Sequence 



<213> OrganismName : Mycobacterium tuberculosis H37Rv
<400> PreSequenceString :

MSFWTAPPV LASAASDLGG lASMISEANA MAAVRTTALA PAAADEVSAA lAALFSSYAR 60 

DYQTLSVQVT AFHVQFAQTL TNAGQLYAW DVGNGVIiLKT EQQVLGVINA PTQTLVGRPL 12 0 

IGDGTHGAPG TGQNGGAGGI LWGNGGNGGS GAPGQPGGRG GDAGLFGHGG HGGVGGPGIA 180 

GAAGTAGLPG GNGANGGSGG IGGAGGAGGN GGLLFGNGGA GGQGGSGGLG GSGGTGGAGM 240 

20 AAGPAGGTGG IGGIGGIGGA GGVGGHGSAL FGHGGINGDG GTGGMGGQGG AGGNGWAAEG 3 00 

ITVGIGEQGG QGGDGGAGGA GGIGGSAGGI GGSQGAGGHG GDGGQGGAGG SGGVGGGGAG 3 60 

AGGDGGAGGI GGTGGNGSIG GAAGNGGNGG RGGAGGMATA GSDGGNGGGG GNGGVGVGSA 420 

GGAGGTGGDG GAAGAGGAPG HGYFQQPAPQ GLPIGTGGTG GEGGAGGAGG DGGQGDIGFD 480 

GGRGGDGGPG GGGGAGGDGS GTFNAQANNG GDGGAGGVGG AGGTGGTGGV GADGGRGGDS 540 

25 GRGGDGGNAG HGGAAQFSGR GAYGGEGGSG GAGGNAGGAG TGGTAGSGGA GGFGGNGADG 60 0 

GISTGGNGGNGG FGGINGTFGT MGAGGTGGLG TLLGGHNGMI GLNGATGGIG STTLTNATVP 660 

LQIiVNTTEPV VFISLNGGQM VPVLLDTGST GLVMDSQFLT QNFGPVIGTG TAGYAGGLTY 720 

MYNTYSTTVD FGNGLLTLPT SVNWTSSSP GTLGNFLSRS GAVGVLGIGP MNGFPGTSSI 780 

VXAMPGLLNN GVLIDESAGI LQFGPNTLTG GITISGAPIS TVAVQIDNGP LQQAPVMFDS 840 

30 GGINGTIPSA LASLPSGGFV PAGTTISVYT SDGQTLLYSY TTTATNTPFV TSGGVMNTGH 900 

V^FAQQPIYV SYSPTAIGTT TFH 923 
<212> Type : PRT
<211> Length : 923

SequenceName : SEQ ID 147 



SequenceDescription :

Sequence 



<213> OrganismName : Mycobacterium tuberculosis H37Rv

<400> PreSequenceString :

MXGNGGAGGS GAPGAIGGAG GPAGLIGVGG AGGAGGDSAV AGVIGGAGGA GGAALLFGAG 60 

GAGGAGGSGG SGAAGGAGGA GGAGGLFASG GSGGFGGFAS TGTGGAGGTG GAGGLFASGG 120 

VGGTGGGAGS GGTGGVGGTG GAGGLFASGG AGGAGGSGGT GGAGGTGGAG GLFGAGGAGG 180 

LGGQGNHTGG HGGAGGSAGL LALGDGGAGG AGGAATTGTG GAGGAGGKAG LLFGSGGAGG 240 

45 SGGAAGTFGD TGNSGGAGGA GGKAGLLFGS GGAGGSGGAG GFANGSTGGA GGAGGGAGLI 30 0 

GISTGGNGGSGG TSVATGGAGN GGAGGAGGGA GLIGNGGNGG SGGMGDAPGG TGVGGIGGLL 3 60 

LGLDGANAPA STNPLHTAQQ QALAAVNAPI QAVTGRPLIG NGANGAPGSG APGGHGGWLF 420 

GGGGTGGSGV SGGAGGDGGA GGILFGAGGA GGAGGAVTGT GATGGSGGAG GGALLFGAGG 480 

AGGAGGSSGI GGFAAGGAGG PGGAGGLFNG GGAGGAGGSG VSGGAGGEGG AGGAGGLFAG 540 

50 GGAGGAGGSG NNVGGAGGAG GVGGLFGAGG AGGSGGGGSV AGDSGAGGNA GLLAPGLAGG 600 

AGGGGGQGFD TGGAGGPGGD AGLLVGSGGV GGAGGFGLTT GGPGAAGGDA GLLFGSGGAG 660 

GAGGSGRTDL GGAGGAGGKA GLIGNGGNGG AGGAGGNGGG DGGPGGAAFG LGNGGNGGNG 720 

GTGTSAGSPG AGGAGGSLIG AE6LPGLLP 749 
<212> Type : PRT 

<211> Length : 749

SequenceName : SEQ ID 148 
SequenceDescription : 



Sequence 

<213> OrganismName : Mycobacterium tuberculosis H37Rv 
<400> PreSequenceString : 

MSFVIAAPEA LVAVASDLAG IGSALAEANA AALAPTTALL AAGADEVSAA lAALFGAHGQ 60 

AYQTVSAQAS AFHAQFVQAL TGGGGAYAAA EAANVSAAQS TDQRLLDLIN GPTQALLGRP 12 0 

65 LXGDGANGGP GQDGGPGGLL YGNGGNGGTS TTAGVAGGNG GAAGLIGNGG AGGGGGAGAA 18 0 

GGNGGAGGWL YGNGGAGGAG GTSVIPGVAG GNGGAGGSAG LWGTGGAGGD GGNGRSGPVN 240 

VA.GSAGGNGG AGGAAGLFGD AGAG6NGGKG GAGGAAFSIN FTAGDGGAGG AGGSGGHALL 300 
WGAGGAGGNG GSGGTGGAGG STAGAGGNGG AGGGGGTGGL LFGNGGAGGH GAAAGNGIiAA 360 

GNGVSSSGGG GAGGTGGAGG DGGAGGAGGN ARLWGVGGAG GAGGDGGAGG AGGKGGSGLS 420 

GNANGGAGGD SGRGGTGGAG GEGGAAGLLV GTGGHG6DGG AGGAAVKGGD GGAAAGTGIA 48 0 

GAGGRGGAGG SGGSGGDGGG GAAGPAGWLF GDGGAGGNGG AAAAGGAGGQ AGGGGGMGGN 540 

5 GGNGGNGGNG GNGATGGWLY GNGGAGGQGA TAGAGGAGAN GVSSTNGGGT GGNGGIGGTG 60 0 

GSGGAGGNAG LLGVGGAGGH GASGGAGDRG GAGGTGFISS DGGAGGDGGD GGNGGAGGTG 660 

GLLFGAGGNG GPGGSGGAAD IGGNGGAGNG GGTDGNGGNG GSGGGAGSGG DGGGAGGNGA 720 

WLFGNGGAGG GGGKGGHGAG GGLGGGSFGL PGLNGSGGDG GDGGNGAPGG VLYGNGGAGG 780 

QGSSGGIGGP GATGGAGGKG 6DGGDAQLIG DGGNGGNGGA GGTGGTPGPG GPGGSGGLGG 840 

10 LLFGQTGTAG VSP 853 



<212> Type : PRT 

<211> Length : 853 

SequenceName : SEQ ID 149 
SequenceDescription : 


Sequence 



<213> OrganismName : Mycobacterium tuberculosis H37Rv
<400> PreSequenceString :

20 MSYLVWPEL VAAAATDLAN IGSSISAANA AAAAPTTALV AAGGDEVSAA lAALFGAHAR 60 

AYQALSAQAA MFHEQFVRAL AAGGNSYAVA EAATAQSVQQ DLLNLINAPT QALLGRPLIG 120 

NGANGIiPGTG QNGGDGGILY GNGGNGGSGG VNQAGGNGGM AGLWGNGGSG GAGGNATTAG 180 

RMGFNGGAGG SGGLLWGNGG AGGAGGNGGP APLVGGVGTT GGAGGNGGGA GLFYGFGGAG 24 0 

GNGGMGGVAP STGPSMGILP AGGVGGPGGS GGASALAFGS GGVGGAGGLG GPTDGTVQGV 3 00 

25 GGFGGQGGNG GQSGLLFGMA GAGGAGAAGG AGTGDTESFG GHGGAGGDGG AVGLIGNGGA 3 60 

GGTGSPGAW GGNGGVGGLG GAGSPGGLLY GTGGAGGNGG PGGDGGTGAT VGFAGSGGFG 420 

GAGGIAQLFG TGGMGGSGGG IGAGTTTWP PDVAPVGGTG GNGGRAGLLL GVGGMGGNGG 480 

ATSVGGTLYA AGGNGGDGGL VWGNGGTGGS GGAGGAGSVG NGGAGGNAAL LFGNGGAGGA 540 

GGAGGIGAGG AGGFGAVLFG NGGAGGSGAP GGIGAGGNGG NALLVGNGGKT GGAGTGGAAG 600 

30 GAGGSGGLIiF GQNGMPGP 618 



<212> Type : PRT 

<211> Length : 618

SequenceName : SEQ ID 150 
SequenceDescription : 


Sequence 



<213> OrganismName : Mycobacterium tuberculosis H37Rv 
<400> PreSequenceString : 

40 MNFSVLPPEI NSALIFAGAG PEPMAAAATA WDGLAMELAS AAASFGSVTS GLVGGAWQGA 60 

SSSAMAAAAA PYAAWLAAAA VQAEQTAAQA AAMIAEFEAV KTAWQPMLV AANRADLVSL 120 

VMSNLFGQNA PAIAAIEATY EQMWAADVSA MSAYHAGASA lASALSPFSK PLQNLAGLPA 18 0 

WLASGAPAAA MTAAAGIPAL AGGPTAINLG lANVGGGNVG NANNGLANIG NANLGNYNFG 240 

SGNFGNSNIG SASLGlSnsnSTIG FGNLGSNNVG VGNLGNLNTG FANTGLGNFG FGNTGNNNIG 3 00 

45 IGLTGNNQIG IGGLNSGTGN FGLFNSGSGN VGFFMSGNGN FGIGNSGNFN" TGGWMSGHGN 3 60 

TGFFNAGSFN TGMLDVGNAN TGSLNTGSYN MGDFNPGSSN TGTFNTGNAN TGFLNAGNIN 420 

TGVFNIGHMN NGLFNTGDMN NGVFYRGVGQ GSLQFSITTP DLTLPPLQIP GISVPAFSLP 480 

AITLPSLNIP AATTPANITV GAFSLPGLTL PSLNIPAATT PANITVGAFS LPGLTLPSLN 540 

IPAATTPANI TVGAFSLPGL TLPSLNIPAA TTPANITVGA FSLPGLTLPS LNIPAATTPA 600 

50 NITVGAFSLP GLTLPSLNIP AATTPANITV SGFQLPPLSI PSVAIPPVTV PPITVGAFNL 660 

PPLQIPEVTI PQLTIPAGIT IGGFSLPAIH TQPITVGQIG VGQFGLPSIG WDVFLSTPRI 720 

TVPAFGIPFT LQFQTNVPAL QPPGGGLSTF TNGALIFGEF DLPQLWHPY TLTGPIVIGS 780 

FFLPAFNIPG IDVPAINVDG FTLPQITTPA ITTPEFAIPP IGVGGFTLPQ ITTQEIITPE 840 

LTINSIGVGG FTLPQITTPP ITTPPLTIDP INLTGFTLPQ ITTPPITTPP LTIDPINLTG 900 

55 FTLPQITTPP ITTPPLTIEP IGVGGFTTPP LTVPGIHLPS TTIGAFAIPG GPGYFNSSTA 960 

PSSGFFNSGA GGNSGFGNNG SGLSGWFNTN PAGLLGGSGY QNFGGLSSGF SNLGSGVSGF 1020 

ANRGILPFSV ASWSGFANI GTNLAGFFQG TTS 1053 
<212> Type : PRT 
<211> Length : 1053 

SequenceName : SEQ ID 151

SequenceDescription :



Sequence 



<213> OrganismName : Mycobacterium tuberculosis H37Rv
<400> PreSequenceString : 

MLYWASPDL MTAAATNLAE IGSAISTANG AAALPTVBW AAAADEVSTQ lAALFGAHAR 60 
SYQTLSTQAA AFHSRFVQAL TTAAASYASV EAANASPLQV ALDVINAPAQ TLLGRPLIGN 120

GADGSTPGQA GGPGGLLYGN GGNGAAGGPN QAG6AGGNAG LIGNGGAGGA GGVGAVGGKR 180

GTGGLLFGNG GAGGQGGLGL AGINGGSGGQ GGHGGNAILF GQGGAGGPGG TGAMGVAGTN 240

PTPIGTAAPG SDGVNQIGNG GNTDLTGGAG GDGNAGSTTV NGGNGGTGGA ARNSSGGTGN 300

SFGGAGGAGG DGANGGDGGA GGEALTEGGA TAVSGAGGKG GNAEASGGAG GNGGKG6FAQ 360

ATTSVTGGNG GNGGNGHDSN APGGAGGSGG VGGDGGRGGL LAGNGGTGGA GGNGGTGGAG 420

APGGAGGAGG KADIANSLGD NATVTGGNGG TGGDGGSALG TGGAGGAGGL GGHGGAGGLL 480

IGMGGAGGAG GLGGAGGAGG AGGEGGAGGA GGEAIPGGAS TNSAGGDGGA GGTGGNGGDG 540

GAGGAP6LGG AGGAGGWLIG QSGSTGGGGA GGAGGAGGAG GAGGSGGAG6 H6DTTSGKN6 600

SSGTAGFDGN PGQPG 615
<212> Type : PRT
<211> Length : 615

SequenceName : SEQ ID 152
SequenceDescription :

Sequence

<213> OrganismName : Mycobacterium tuberculosis H37Rv
<400> PreSequenceString :

MHYSVLPPEI NSALIFAGAG SGPMIiAAASA WDGLATELAS AAVSFGSVTA GLVGGSWQGR 60

SSVAMAAAAA PYAGWLAAAA TQAEQAATQA QVMVAEFEAV RLAMVQPALV AANRSGLISL 120

VISNLFGQNA PAIAAAEAAY EEMWALDVSA MAAYHSGASA VAVALPAFAL PLRLPAGLAA 180

GPAAWTALT TAVGMPTFAG RAIAASLGIiA NVGGGNLGNA NNGLGNIGNA NLGNNNLGSG 240

NFGSFIsriGSA NLGGNNIGIG NAGANNFGIiA NLGNLNTGFA NAGIGMFGIA NTGKTNNIGNG 300

LTGNNQIGIG GLNSGNGNVG LiFNAGSANIG FFNSGNGMFG IGNSGNFSTG LFNPGHGNTG 360

FLNAGSFNTG MFDVGNANTG SFNVGHYNFG AFNPGPSNTG TFNTGGANTG WFNTGSINTG 420

AFNIGDMNNG LFNTGDMISING VFYRGVGQGS LQFAITSPDL TLiPSLEIPGI SVPAFSLPAI 480

TlaPSIiTIPAV TTPANVTVGA EDLPGLTVPS LTIPAAMTPA NITVGAFDLP GLTVPSLTIP 540

ATTTPANITV GAFiTLPQLSI PSVTVPPITI PAGTALGAFN LPTLSIPSVT VPPITIPAGT 600

TVGGFTLPTI HTPLISTPQI SIGGFSTPGI ATQANSGVIN LPTFSIiNGIT ITNLWFIPN 660

NITALQTIJMP GVFPQIGGFA NTPPAFINTG TITVGGGQIN GVGFSIGAIN VTPFTLPNW 720

IQPWSLGGIS VDGFTLPEIS TQEFTTPALT ISPIGVGALS LPDITTQQFT TPELTIDPIT 780

LGGFTLPQLS IPAITTPAFT IDPIALGGFT LPQIMTPEIT TPPFAIDPIG LSGFTLPQVN 840

IPEITTPEFT rQPVGIiAAFT TPALTIASIH LPSTTMGGFA IPAGPGYFNS SATPSLGFFN 900

AGIGGNSGFG NSGSGLSGWF NTSPVGLLAG SGYQNyGGIil SGFSNIiGSGI SGFAKTGTLP 960

FAVTSLVSGL ANIGNNLSGIi FFQSTTP 987
<212> Type : PRT
<211> Length : 987

SequenceName : SEQ ID 153

SequenceDescription :

Sequence

<213> OrganismName : Mycobacterium tuberculosis H37Rv
<400> PreSequenceString :

MSFVWAPEV LAAAASDLAG IGSTLAQANA AALAPTTAVL AAGADEVSAA lASLFGAHGQ 60

AYQAVSAQMS AFHAQFMQAL TGAGGAYAAA EAVNVSAAQS VEQDLLAAIN ARFERIFGRP 120

LIGDGANGGP GQDGGPGGLL YGNGGNGGTS TTVGMAGGNG GAAGLIGNGG FGGGGGPGAA 180

GGNGGAGGWL FGNGGAGGAG GLGVAPGVPG GAGGAGGAGG VGGPAGLWGH GGAGGAGGAG 240

VAGAGGFEGT IGAGGAGGVG GAGGVGGAGG AGGWLYGDAG AGGDGGVGGA GGTGGLGNRG 300

GAGGAGGAGG VGGAGGAAGL WGGGGAGGVG GTGGGAGLGA QSVTFSSSLS GLSGGDGGAG 360

GAGGAGGAGG TGGWLYGGGG AAGSGGDGGT GGQGGAGGAG VFSLFGSGGG PGGNGGVGGV 420

GGVGGAG6RA GLFGVGGLGG AGGDAGDSGE GGFGGPGLAG GLFGNPGNGG VGGIGGDAAA 480

GGAGGAGGNG GAGGNGGWLF GNGGAGGSGG DGGAAGRGGA GNLGSAGGIN APAGNPGSGS 540

VGIGGAGGAG GTAGLFGDGG AGGAGGAGAA GGFGGISAAT PSAGSEGAMG GAGGVGGNAR 600

LLGTGGAGGV GGGGGAGGDG GRGGVATPGG QGGDAGDGGA GGAGGNGGGA SGAGGWLLGT 660

GGAGGAGGNG GNGGKAGFSP GPTNFGLNGA GGGGGVGGNG ATGPWLFGDG GPTPGSTGAG 720

AAGGHGGDAQ LIGNGGHGGA GGTGVPNGSG GAGGLSGLLF GEPGANG 767
<212> Type : PRT
<211> Length : 767

SequenceName : SEQ ID 154
SequenceDescription :

Sequence

<213> OrganismName : Mycobacterium tuberculosis H37Rv
<400> PreSequenceString :

MSFVIANPEM XjAAAATDLAG IRSAISAATA AAAAPTIQVA AAGADKVSLA ISALFGQHAQ 60 

AYQALSAQAT XFHDQFVQAL TSG6NLYAAA ESHTVBQMVL NAINAPTQTL FGRPLIGDGA 120 

NGTAENPDGQ JSTGGLLFGNGG NGFTQTTAGV AGGNGGSA6L IcaiGGAGGGG GAGAAGGLGG 180 

NGGWLYGNG6 AGGIGGAGTG TGGHGGAGGA GGRAWLWGTG 6AGGAGGDGG WLFGDGGAGG 240 

5 TGGNGGSGFN SLTSSVGGAG GAG6HAGLFG AGGTGGTGGI GGQNTBTGPA ASNGGAGGAG 3 00 

GGGGYLVGDG GAGGTGGAGG KNSSGGATLT GGTGGTGGAG QAAGWLYGSG GAGGAGGAGG 360 

LNNAGGATG6 XGGTGGAGGS GAWIiYGNGGA AQAGGNGGNN TSAGTGGVGA SGGTGGNAGL 420 

IGAGGHGGAG C3AGGNQTGGV GNGGAGGNGG AGGAGGQLYG NGGDGGNGGA GGANIAGGNG 480 

SDGGAAGHGG AGGSARLIGA GGHGGDGGAG GNTAGRRADA lAGTGGDGGN GGNGGLLSGN 540 

10 AGAGGHGGA6 GSSTATTTTG TPPTGATGCaST GGNGGAGGTA GFTGSGGIGG NGGAGGTGGN 600 

AGVALSVGST GGLGOtTGGSG GLGGGGGSLF GNGQAGGVGA TGGNGGSGIG PASVGGNGGK 660 

GGVGAAGGLA GQIGNGGSGG SGGAGGNGGT GDTAGNGGNG GAGAVGGNAQ LIGNGGNGGG 720 

GGNGGTGAD6 T 731 
<212> Type : PRT

<211> Length : 731

SequenceName : SEQ ID 155 
SequenceDescription : 



Sequence

<213> OrganismName : Mycobacterium tuberculosis H37Rv
<400> PreSequenceString :

MPGRFRNFGS QIJLGSGNIGS TNVGSGNIGS TNVGSGNIGD TNFGNGNNGN FNFGSGNTGS 60 

NNIGFGNTGS GNFGFGNTGN NNIGIGLTGD GQIGIGGLNS GSGNIGFGNS GTGNVGLFNS 12 0 

25 GTGNVGFGNS GTANTGFGNA GNVNTGFWNG GSTNTGLANA GAGNTGFFDA GNYNFGSLNA 18 0 

GMINSSFGNS GDGNSGFLNA GDVNSGVGNA GDVNTGLGMS GNINTGGFNP GTLNTGFFSA 240 

MTQAGPNSGF FNAGTGNSGF GHNDPAGSGN SGIQNSGFGN SGYVNTSTTS MFGGNSGVLN 3 00 

TGYGNSGFYN AAWNTGIFV TGVMSSGFFN FGTGNSGIiLV SGNGLSGFFK NLFG 354 

<212> Type : PRT

<211> Length : 354 

SequenceName : SEQ ID 156
SequenceDescription :

Sequence

<213> OrganismName : Mycobacterium tuberculosis H37Rv
<400> PreSequenceString :

MSFVLAMPEV XiGSAATDLAA LGSVLGAADA AAAATTTGIV AAAQDEVSAA lAALFSAHGR 60 

40 AYQVASAQAA AVHAQFVEAL SAGAGAYASA EAAGAAVLAN PAQSVQQDLL AAVNAQSVAL 120 

TGRPLIGNGA IsTGAPGTGANG APGGWLLGNG GAGGSAAAGS GLPGGAGGAA GLFGTGGAGG 18 0 

AGGSSTVGDG EAGGAGGSGG WLLGTGGVGG VGGLGAGAGG AGGVGGAGGL LGAGGHGGAG 240 

GLGAVTGGVG GTGGAGGLLA GLLAGPGGAG GTGGRGFLNN GGVGGAGGNA GLLFGAGGTG 3 00 

GSGGAGLGGD GGAGGAGGNT GVLFGMAGSG GTGGFGDTDG GAGGAGGDAG WLGSGGVGGA 3 60 

45 GGFGETGDGG VGGAGGKAGL LIGNGGAGGA GGQGAVTGGT GGAGGDGVLI GNGGNAGIGG 420 

TGPTAGDTGA GGISGLLLGA DGFNTPASAS PLHTLKQQAL AAINAPTQTL TGRPLIGNGT 480 

PGAVGSGATG APGGWLLGDG GAGGSGAAGS GAPGGAGGAA GLWGTGGAGG AGGSSAGGGG 540 

AGGAGGAGGW LLGDGGAGGI GGASTVLGGT GGGGGVGGLW GAGGAGGAGG TGLVGGDGGA 600 

GGAGGTGGLL AGLIGAGGGH GGTGGLSTNG DGGVGGAGGN AGMLAGPGGA GGAGGDGENL 660 

50 DTGGDGGAGG SAGLLFGSGG AGGAGGFGFL GGDGGAGGNA GLLLSSGGAG GFGGFGTAGG 720 

VGGAGGNAGW LGFGGAGGVG GSAGLIGTGG NGGNGGTGAN AGSPGTGGAG GLLLGQNGLN 78 0 

GLP 783 
<212> Type : PRT
<211> Length : 783

SequenceName : SEQ ID 157

SequenceDescription : 



Sequence 



<213> OrganismName : Mycobacterium tuberculosis H37Rv
<400> PreSequenceString : 

MSLVIATPQL LATAALDLAS I6SQVSAANA AAAMPTTEW AAAADEVSAA lAGLFGAHAR 60 

QYQALSVQVA AFHEQFVQAL TAAAGRYAST EAAVERSLLG AVNAPTEALL GRPLIGNGAD 120 

GTAPGQPGAA GGLLFGNGGN GAAGGFGQTG GSGGAA6LIG NGGNGGAGGT GAAGGAGGNG 180 

65 GWLWGNGGNG GVGGTSVAAG IGGAGGNGGN AGLFGHGGAG GTGGAGLAGA NGVNPTPGPA 24 0 

ASTGDSPADV SGI6DQTGGD GGTGGHGTAG TPTGGTGGDG ATATAGSGKA TGGAGGDGGT 300 

AAAGGGGGNG GDGGVAQGDI ASAFGGDGGN GSDGVAAGSG GGSGGAGGGA FVHIATATST 360 
GGSGGFGGMG AASAASGADG GAGGAGGNGG AGGLLFGDGG NGGAGGAGGI GGDGATGGPG 420
GSGGNAGIAR FDSPDPEAEP DWGGKGGDG 6KGGSGLGVG GAGGTGGAGG NGGAGGLLFG 480 
NGGNGGNAGA GGDGGAGVAG GVGGNGGGGG TATFHEDPVA GWAVGGVGG DGGSG6SSLG 540 
VGGVGGAGGV GGKGGASGML IGNGGNGGSG GVGGAGGVGG AGGDGGNGGS GGNASTFGDE 600
NSIGGAGGTG GNGGNGANGG NGGAGGIAGG AGGSGGFLSG AAGVSGADGI GGAGGAGGAG 660
GAGGSGGEAG AGGLTNGPGS PGVSGTEGMA GAPG 694 
<212> Type : PRT 
<211> Length : 694 

SequenceName : SEQ ID 158 
SequenceDescription :

Sequence 



<213> OrganismName : Mycobacterium tuberculosis H37Rv
<400> PreSequenceString :

MSFVIAAPEV lAAAATDLAS LESSIAAANA AAAANTTALL AAGADEVSTA VAALFGAHGQ 60
AYQALSAQAQ AFHAQFVQAL TSGGGAYAAA EAAATSPLLA PINEFFLANT GRPLIGNGTM 120
GAPGTGANGG DGGWLIGNGG AGGSGAAGVN GGAGGNGGAG GLIGNGGAGG AGGRASTGTG 180
GAGGAGGAAG MLFGAAGVGG PGGFAAAFGA TGGAGGAGGN GGLFADGGVG GAGGATDAGT 240
GGAGGSGGNG GLFGAGGTGG PGGFGIFGGG AGGDGGSGGL FGAGGTGGSG GTSIINVGGN 300
GGAGGDAGML SLGAAGGAGG SGGSNPDGGG GAGGIGGDGG TLFGSGGAGG VCGLGFDAGG 360
AGGAGGKAGL LIGAGGAGGA GGGSFAGAGG TGGAGGAPGIi VGMAGNGGNG GASANGAGAA 420
GGAGGSGVLI GNGGNGGSGG TGAPAGTAGA GGLGGQLLGR DGFNAPASTP LHTLQQQILN 480
AIMEPTQALT GRPLIGNGAN GTPGTGADGG AGGWLFGNGG NGGHGATGAD GGDGGSGGAG 540
GILSGIGGTG GSGGIGTTGQ GGTGGTGGAA LLIGSGGTGG SGGFGLDTGG AGGRGGDAGL 600
FLGAAGTGGQ AALSQNFIGA GGTAGAGGTG GLFANGGAGG AGGFGANGGT GGNGIiLFGAG 660
GTGGAGTLGA DGGAGGHGGL FGAGGTGGAG GSSGGTFGGN GGSGGNAGLL ALGASGGAGG 720
SGGSALNVGG TGGVGGNGGS GGSLFGFGGA GGTGGSSGIG SSGGTGGDGG TAGVFGNGGD 780
GGAGGFGADT GGNSSSVPNA VLIGNGC^TGG NGGKAGGTPG AGGTSGLIIG ENGLNGL 837

<212> Type : PRT
<211> Length : 837

SequenceName : SEQ ID 159

SequenceDescription :

Sequence

<213> OrganismName : Mycobacterium tuberculosis H37Rv

<400> PreSequenceString :
MSFVIAVPET lAAAATDLAD LGSTIAGAWA AAAANTTSLL AAGADEISAA lAALFGAHGR 60
AYQAASAEAA AFHGRFVQAL TTGGGAYAAA EAAAVTPLLN SINAPVIiAAT GRPLIGNGAN 120
GAPGTGANGG DAGWLIGNGG AGGSGAKGAN GGAGGPGGAA GLFGNGGAGG AGGTATANNG 180
IGGAGGAGGS AMLFGAGGAG GAGGAATSLV GGIGGTGGTG GNAGMLAGAA GAGGAGGFSF 240
STAGGAGGAG GAGGLFTTGG VGGAGGQGHT GGAGGAGGAG GLFGAGGMGG AGGFGDHGTL 300
GTGGAGGDGG GGGLFGAGGD GGAGGSGLTT GGAAGNGGNA GTLSLGAAGG AGGTGGAGGT 360
VFGGGKGGAG GAGGNAGMLF GSGGGGGTGG FGFAAGGQGG VGGSAGMLSG SGGSGGAGGS 420
GGPAGTAAGG AGGAGGAPGL IGNGGNGGNG GESGGTGGVG GAGGNAVLIG NGGEGGIGAL 480
AGKSGFGGFG GLLLGADGYN APESTSPWHN LQQDILSFIN EPTEALTGRP LIGNGDSGTP 540
GTGDDGGAGG WLFGNGGNGG AGAAGTNGSA GGAGGAGGIL FGTGGAGGAG GVGTAGAGGA 600
GGAGGSAFLI GSGGTGGVGG AATTTGGVGG AGGNAGLLIG AAGLGGCGGG AFTAGVTTGG 660
AGGTGGAAGL FANGGAGGAG GTGSTAGGAG GAGGAGGLYA HGGTGGPGGN GGSTGAGGTG 720
GAGGPGGLYG AGGSGGAGGH GGMAGGGGGV GGNAGSLTLN ASGGAGGSGG SSLSGKAGAG 780
GAGGSAGLFY GSGGAGGNGG YSLNGTGGDG GTGGAGQITG LRSGFGGAGG AGGASDTGAG 840
GNGGAGGKAG LYGNGGDGGA GGD6ATSGKG GAGGNAWIG NGGNGGNAGK AGGTAGAGGA 900
GGLVLGRDGQ HGLT 914

<212> Type : PRT

<211> Length : 914

SequenceName : SEQ ID 160
SequenceDescription :

Sequence

<213> OrganismName : Mycobacterium tuberculosis H37Rv
<400> PreSequenceString :
MSLVIVTPET VAAAASDVAR IGSSIGVANS AAAGSTTSVL AAGADEVSAA lATLFGSHAR 60
EYQAISTQVA APHDRFAQTL SAAVGSYVSA EATNAAPLAT LEHNVLNALN APTQALLGRP 120
LIGDGAAGAP GTGQAGGAGG ILWGNGGAGG SGAPGQVGGA GGAAGLFGTG GAGGAGGAGA 180
AGGAGGSGGW LLGWGGVeGA GGQSLLGGAT GGAG6NAGLF GVGGTGGPGG PGGPGGVGGT 240

GGAGGLGGTL YGAGGHGGAG GPGPIGGVGG HGGVGQAA6L LGVGGHGGAG GHGAEGVAGA 3 00 

AGEDLSPH6T SGGV66DA.GD G6TGGRGGWL AGAG6AGGA6 GVGGTGGAGG AGFSRALIVA 3 60 

GDNGGDGGNG GMGGA6GA.GG PGGAGGLISL LGGQGAGGAG GTGGAGGVGG DRGAGGPGNQ 420 

5 AFNAGAGGAG GHGGDPGA.GG AGGTGGAGSI TGAQGAIGAT PTSGGMGGAG GNGANATTAG 480 

TNGANGGPGG HGGLVGNG-GA GGNGANGAAG TNASDSGAVG GKGNSGGNGG QGGAGGDGGT 540 

LAGWGGAGGT GGRGADGGLG GSGAEGANAT TAGERGQDGG KGGNGGVGGT GGNAVAPGAN 60 0 

GGHGGNGGNP 6PSGAGGLGG LSGDGVTRAA QGATPDFADT GGKGGNGGNG ANAVAPGGTG 660 

ASGGAGGNAG AGGKGGEiTXI GDGGGGNGGA GGKGGAGTLL GLTVFGDNGG AGVLGDSTDP 720 

10 DGSGGAGGAG GAGGAGGDE>T I ^ 741 



<212> Type : PRT 

<211> Length : 741 

SequenceName : SEQ ID 161 
SequenceDescription :

Sequence

<213> OrganismName : Mycobacterium tuberculosis H37Rv
<400> PreSequenceString :

20 MSFVTAAPEM LATAAQNV.AN IGTSLSAANA TAAASTTSVL AAGADEVSQA lARLFSDYAT 60 

HYQSLNAQAA AFHHSFVQTB NAAGGAYSSA EAANASAQAL EQNLIiAVINA PAQALFGRPL 120 

IGNGANGTAA SPNGGDGG-IL YGNGGNGFSQ TTAGVAGGAG GSAGLIGNGG NGGAGGAGAA 18 0 

GGAGGAGGWL LGNGGAGGPG GPTDVPAGTG GAGGAGGDAP LIGWGGNGGP GGFAAFGNGG 240 

AGGNGGAS6S IiFGVGGAGGV GGSSEDVGGT GGAGGAGRGL FLGLGGDGGA GGTSNNNGGD 3 00 

25 GGAGGTAGGR LFSLGGDG-GN GGAGTAIGSN AGDGGAGGDS SALIGYAQGG SGGLGGFGES 3 60 

TGGDGGIiGGA GAVLIGTG-"VG GFGGLGGGSN GTGGAGGAGG TGATLIGLGA GGGGGIGGFA 42 0 

VNVGNGVGGL GGQGGQGAAL IGLGAGGAGG AGGATWGLG GNGGDGGDGG GLFSIGVGGD 480 

GGNAGNGAMP ANGGNGGKTAG VIANGSFAPS FVGFGGNGGN GVNGGTGGSG GILFGANGAN 540 

GPS 543 

<212> Type : PRT

<211> Length : 543

SequenceName : SEQ ID 162
SequenceDescription :

Sequence

<213> OrganismName : Mycobacterium tuberculosis H37Rv
<400> PreSequenceString :

MSYVLATPBM VAAAAlSlNIiAQ IGSTLSAANA AALAPTTGVL AAGADEVSAA VASLFSGHAQ 60 
40 AYQTIiGTQAA AFHERFIQAIi STAAGAYGSA EAANASPLQQ ALNVINAPTQ TLLGRPLIGN 12 0 

GTNGAPGTGQ AGGPGGLLaYG NGGNGGSGGV GQAGGAGGSA GIiXGIGGTGG AGGAGAVGGV 180 
GGNGGWLYGKT GGAGGLGGTG VAGVNGGMGA AGGAGGMAYL FGSGGAGGQG GMGAAGADGV 240 
NPTPTGTADA GSTGTDQTXiG GNAIGGNGGP GDAGDAMTSG GAGGSGGNAV STVNGDAVGG 3 00 

EGGKGGEGAY GGAGGAGGSA ASIGMAAIGG NGGAGGNAQA PGGVGGAGGE GGDAQVGTNS 3 60 

45 PSNAEAGNGG SGGNGFDSFA SGGTGGAGGT GGAGGRGGLL IGDGGAGGAG GVGGTGGSGA 42 0 

PGGGGGAGGD GGAANTDSAG SSRKAFGGDG GVGGDGASAL GTGGEGGIGG QGGNGGAGGL 480 
LIGNGGAGGV GGTAGAGGTG GSGGAGGAGG AGGGGTNSGP GAAFGGNGNT GGNGGNGGAP 540 
GALGGKGGSG GLIGRAGSI3G GVGAGGAGGA GGAGGTGGEG GTGGDGKTTD GNPGMGGSPG 600 
SAGQPG 606 
<212> Type : PRT

<211> Length : 606

SequenceName : SEQ ID 163
SequenceDescription :

Sequence

<213> OrganismName : Mycobacterium tuberculosis H37Rv
<400> PreSequenceString :

MSPLFAQPEM LGAAATDIiAS IGSAISTANA AAAAATTRVL AAGADEVSAA VAALFSGHAQ 60 

60 TYQALRTQAA AFHQQIVQTL TSTAGAYASA EAANVEQQLL GAINAPTMAL LGRPLIGHGA 12 0 

DGAPGTGQAG GAGGILYGJSTG GNGGSGATGQ A6GAGGAAGL IGHGGAGGLG GTGASGGAGG 18 0 

AGGWLWGNGG AGGNGGVGVA GDPGGVGGAG GAGGAAGLWG SGGSGGTGGQ GGVGGGKSGD 240 

GGTGGIG6AG GGGGWLHGIDG GAGGHGGQGG TGVSSGGKTGG AGGTGGDGRG LSGSGGAGGR 3 00 

GGQTGVGGKV GENNFGGAGG AGGTGGLIGN GGAGGJSTGGQG AISGAGGAGG NAWLIGDGGA 3 60 

65 GGNGGDIRGQ GGGAGGAGGA GGQLIGNGGT GGAGGTVTSP NGLGGAGGAG GSAGLIGHGG 42 0 

TGGAGGHSAQ GPEGNGGIGG AGGAGGNGGQ LYGTGGTGGT GGKGGDGFGV F6KGGAGGTG 480 

6RGGAAGLIG DAGTGGTGGK GGTAGEDGTG GNGGTGGNGG AAVIiIGNGGG GGAGGNGGAG 540 
NDGTPGNGGG GGVGGTGGTL FGQPGQPGPP GQPGPA 576
<212> Type : PRT
<211> Length : 576

SequenceName : SEQ ID 164
SequenceDescription :

Sequence

<213> OrganismName : Mycobacterium tuberculosis H37Rv

<400> PreSequenceString :

MWTSQMIVAP AFVDAAAKDL ATIGSAISRA NAEALVPITA LLPAGADDVS AAIAAIiFATH 60 
GQAYQELSAH AVAFHEQFVQ LMSAGAAQYA SAEAANTSSPL QIVGQTALDA INSPVQTLTG 120 
RPLIGNGANG VAGTGQNGGD GGWLYGNGGN GGSGGTGQNG GNGGSAGLWG SGGNGGQGGA 180 
GANGAAGQPG KAGGSGGNGG AGGWIYGHGG HGGAGGNGGN ATAPGGASAG FDGGAGGJTGG 240 

15 SGGRGGLLFG NGGNGSVGGM GGQGTNDTAG DSAGSGGLGG NGGNGAQGGW LIGNGGQGGD 3 00 

SGAGGGTDST QTGVMNGASG GSAGIAGNGG DAGLVGNGGA GGNGGNGAAG SALGTTIFGG 3 60 

SGGVGGSGGD GGISTGGWLFGS GASGGNGGQG GDAGTNGFAG FGGSAGGGGW VGAVNFGPIS 420 
VQGFGLFGHG GDGGNGGDVG AGSLSIQFGA SGGDGGQGGV LYGNGGNGGN AGSGGGTGFE 480 
GSAGQGGAAI LIGNGGAGGN GATGGTGVGN IIQEAGGDGS DGGAGGSGGL LFGSGGAGGI 540 

20 GGAGGVGGSG NDGGNGGDGG QGGASGLGIG NGGPGGSGGT GGAGGTGGSA 6TGGAGGDGG 600 
NAAIiLIGTGG DGGDGVPPAP GGQGGKGGLX GLPGQNGQP 639 
<212> Type : PRT 
<211> Length : 639

SequenceName : SEQ ID 165

SequenceDescription :

Sequence 



<213> OrganismName : Mycobacterium tuberculosis H37Rv
<400> PreSequenceString :

MSWVMVSPEL WAAAADLAG XGSAISSANA AAAVNTTGLL TAGADEVSTA lAALFGAQGQ 60
AYQAASAQAA AFYAQFVQAL SAGGGAYAAA EAAAVSPLLA PINAQFVAAT GRPLIGNGAN 120
GAPGTGANGG PGGWLIGNGG AGGSGAPGAG AGGNGGAGGL FGSGGAGGAS TDVAGGAGGA 180
GGAGGNAGML FGAAGVGGVG GFSNGGATGG AGGAGGAGGL FGAGRERGSG GSGNLTGGAG 240
GAGGNAGTLA TGDGGAGGTG GASRSGGFGG AGGAGGDAGM FFGSGGSGGA GGISKSVGDS 300
AAGGAGGAPG LIGNGGNGGN GGASTGGGDG GPGGAGGTGV LIGNGGNGGS GGTGATLGKA 360
GIGGTGGVLL GLDGFTAPAS TSPLHTLQQD VINMVNDPFQ TLTGRPLIGN GANGTPGTGA 420
DGGAGGWLFG NGGNGGQGTI GGVNGGAGGA GGAGGILFGT GGTGGSGGPG ATGLGGIGGA 480
GGAALLFGSG GAGGSGGAGA VGGNGGAGGN AGALLGAAGA GGAGGAGAVG GNGGAGGNGG 540
LFANGGAGGP GGFGSPAGAG GIGGAGGNGG LFGAGGTGGA GGGSTLAGGA GGAGGNGGLF 600
GAGGTGGAGS HSTAAGVSGG AGGAGGDAGL LSLGASGGAG GSGGSSLTAA GWGGIGGAG 660
6LLFGSGGAG GSGGFSNSGN GGAGGAGGDA GLLVGSGGAG GAGASATGAA TGGDGGAGGK 720
SGAFGLGGDG GAGGATGLSG AFHIGGKGGV GGSAVLIGKTG 6NGGNGGNSG NAGKSGGAPG 780
PSGAGGAGGL LLGENGLNGL M 801
<212> Type : PRT

<211> Length : 801

SequenceName : SEQ ID 166
SequenceDescription :

Sequence

<213> OrganismName : Mycobacterium tuberculosis H37Rv
<400> PreSequenceString :

GQSYQAVSAQ AAAFHDRFVQ LLNAGGGSYA SAEIAMAQQN LLNAVNAPTQ TLLGRPLVGD 60
GADGASGPVG QPGGDGGILW GNGGNGGDST SPGVAGGAGG SAGLIGNGGR GGNGAPGGAG 120
GNGGLGGLLL GNGGAGGVGG TGDNGVGDLG AGGGGGDGGL GGRAGLIGHG GAGGNGGDGG 180
HGGSGKAGGS GGSGGFGQFG GAGGLLYGNG GAAGSGGNGG DAGTGVSSDG FAGLGGSGGR 240
GGDAGLIGVG GGGGGNGGDP GLGARLFQVG SRGGDGGVGG WLYGDGGGGG DGGNGGLPFI 300
GSTNAGNGGS ARLIGNGGAG GSGGSGAPGS VSSGGVGGAG NPGGSGGNGG VWYGNGGAGG 360
AAGQGGPGMN TTSPGGPGGV GGHGGTAILF GDGGAGGAGA AGGPGTPDGA AGPGGSGGTG 420
GLLFGVPGPS GPDG 434
<212> Type : PRT
<211> Length : 434

SequenceName : SEQ ID 167
SequenceDescription :



Sequence 
<213> OrganismName : Mycobacterium tuberculosis H37Rv
<400> PreSequenceString :

MAHFSVLPPE INSLRMYLGA GSAPMLQAAA AWDGIiAAELG TAASSFSSVT TGIiTGQAWQG 60
PASAAMAAAA APYAGFLTTA SAQAQLAAGQ AKAVASVFEA AKAAIVPPAA VAANREAFIiA 120
LIRSNWLGLN APWIAAVESL YEEYWAADVA AMTGYHAGAS QAAAQLPLPA GIiQQFLNTLP 180
NLGIGNQGNA NLGGGNTGSG NIGUGNKGSS NLGGGNIGNN NIGSGNRGSD NFGAGNVGTG 240
MIGFGNQGPI DVNLLATPGQ NNVGLGNIGN NNMGFGNTGD ANTGGGNTGN GNIGGGMTGN 300
NNFGFGNTGN NNIGIGLTGN NQMGINIiAGL liNSGSGNIGI GNSGTNNIGL FNSGSGNIGV 360
FNTGANTLVP GDLNNLGVGN SGMANIGFGN" AGVLNTGFGN ASILNTGLGN AGELNTGFGN 420
AGFVNTGFDN SGNVNTGNGN SGNIISTTGSWN AGNVNTGFGI ITDSGLTNSG FGNTGTDVSG 480
FFNTPTGPIiA VDVSGFFNTA SGGTVINGQT SGIGNIGVPG TLFGSVRSGXi NTGLFNMGTA 540
ISGLFNLRQL LG 552
<212> Type : PRT
<211> Length : 552

SequenceName : SEQ ID 168
SequenceDescription :



Sequence 

<213> OrganismName : Mycobacterium tuberculosis H37Rv
<400> PreSequenceString :

MSFLIASPEA LAATATYLTG IGSAISAANA VAAAPTTEIL AAGTDEVSTA ISALFGAHAQ 60
AYQALSAHVA AFHDQFVHTL TAGAGSYMAA EAAAASPLQA LQLELLNAIN APTLALLGRP 120 

25 LIGDGTDAAP GSGGAGGAGG IIiIGNGGTGG ASDLAGTGRG GVG6AGGAGG LFGIGGAGGG 180 
CGSAVAIGGD GGAGGAGGVF SGGGAGGAGD AIGGSGGAGG TGGLLGGGGG AGGAGGAGGN 240 
GGGASNSASI GGDGGSGGAG GMLY'GAGGVG GNGGAAVAIG GDGGAGGRAG AXGMGGDGGN" 3 00 

GGTSNTPGGS GGDGGNGGNA GLIGNGGNGG NAEIVISGGS VAGTGGNGGL LLGFNGTNGL 3 60 

P 361 

30 <212> Type : PRT 

<211> Length : 361 

SequenceName : SEQ ID 169
SequenceDescription :



35 Sequence 



<213> OrganismName : Mycobacterium tuberculosis H37Rv 
<400> PreSequenceString : 

AQASPAAHGG SGGAGGNGGA GSACTTGGAGG AGGNGGAGGN GGGGDAGNAG SGGNGGKGGD 60 

40 GVGPGSTGGA GGKGGAGAiJG GSSN"GNARGG NAGNGGHGGA GGSGDTGGAG GAGGQGGFGG 120 

TGGSGSGIGG GAGGNGGNGG AGGTGWLGG KGGDGGNGDH GGPATNPGSG SRGGAGGSGG 180 

NGGAGGNATG SGGKGGAGGN GGDGSFGATS GPASIGVTGA PGGNGGKGGA GGSNPNGSGG 24 0 

DGGKGGNGGA GGNGGSIGAN SGIVGGSGGA GGAGGAGGNG SLSSGEGGKG GDGGHGGDGV 300 

GGNSSVTQGG SGGGGGAGGA GGSGFFGGKG GFGGDGGQGG PNGGGTVGTV AGGGGNGGVG 3 60 

45 GRGGDGVFAG AGGQGGLGGQ GGNGGGSTGG NGGLGGAGGG GGNAPDGGFG GNGGKGGQGG 420 
IGGGTQSATG LGGD6GDGGD GGNGGNSGAK AGGAGGKGQA GQPNSGTEPG FGGDGGLGGA 480 
GATP 484 
<212> Type : PRT 
<211> Length : 484 
SequenceName : SEQ ID 170
SequenceDescription : 



Sequence

<213> OrganismName : Rickettsia prowazekii
<400> PreSequenceString :

MKKSKILRKF LATASLCGTL FTNSNATGTI IPNNGSVSLN TDAGLVGGVF NNGDIIQIVN 60 

GGREIKISAD KANAIIGGIN TLKELPDFGG VEVSQWSIG PIiNAGEDLNT NFGPLKFISN 120 

NVTSIITGVG TKTFSNIDFA GKNATLQINK DLNITTKIDN TVAGNNGSIT FEGSGIISNH 180 

60 IGYTNSLLGI NVGNGEAKIY APEANNITIN AKNINLTHNN SILTLCDGNI TTLKGNINNT 240 

TEIDGQGILN LAYDLGSSSI ITGDIGNIGS LDTINVLLGS ATFNSTILKA TNINLKHNTS 30 0 

TLNLDDNIIV IGNIKGNNNK DIIiNTFKVHGT NLDNEMIIPA PQKTHGTLNF KGNATLNGNI 360 

NNLNILKFSG GHGKTLNLQG NTKVDNLVFA DSVLDSGTIS VNGLLDTDCV TFNNSNVNGG 42 0 

TLIINAKNTI SAKLLNATKA KIQINANLTM NHPSAGDISD IRIADNTIYT IDAKNGNVNL 480 

65 LNNNAKIIFE GADSMLALIN TGVTADRTFT lYNMLNQSGN DEYGIVKIEA IKKVITIANQ 540 

SGPYTIGQDN THRLKELIVB GAGDIIIDDT IFTKLLSINS TGQITFNRTL DLGAGGNIAF 600 

GKHGTLWNG VTGSITTSEN NQGILTINSG NITGVIGTNE LGLKLVNIGA DPVTCSANVF 660 
ASVALTNPSS VLILADGVTL TGEVTTHZsTNT KGVIiSLGTGS NITGQIGTNS AALEKINIGA 720
QASNIDSNIY AGSTVLTDQT SELTLNISTDW VNSNIITTAG NNSGKLIFTG NGGITGNIGA 780
NGAALQEWF NGTTNIGGTA NSQNFTVAHS AANWITGLT TGALKYKDTG TIIAHGGLVG 840
DIDPNNKAGK FIIiGDGAMID GSVLCNGGVA GTLDFIGDGN VTQNIGADNA NSISTINIQG 900
DNTKNVTIAN DIFVDNIHFT NGGILQLGGN LTTHNIDFGA NGGTLEFNGN NTYNLNAIIV 960
NGQNGILNAF TNLKASDDTI GTVKIINIGQ IGTPQNFTIQ VNNKNLTLVS SVNSSINFGD 1020
ANSQLILSAP VDQTIKFINW LNETGGGIIT IiDSNGNNLTI SGNNGIKLGS KGNELSSLNI 1080
KGKVTVTNDL DIQNIHQLNI NNGALFDDQS LTSAKIKNIN IGTVAGGATY TLDAINDNFD 1140
LNTSGMVFKH QDSILELKNS SNTNDHTITL TSAIiDPGNNQ FGIIKIiITDT NKLTIDNNGW 1200
VAYTLGTANH MLKQLTFAS I DNGAIALKVG INVENVTLNI KDIELNEVNA NVIiFNKNTTY 1260
TATGNINGHV DFQGNAGVIN LNDDIEIDGS VTSTGNVNGT LNFNGSGKVT GLINNIVMLQ 1320
AGAGDVSLSA SGNYSITEIQ GNGNNNLTFA ANSHLTTDIN KTGGQDLNLV FINGGSVSGS 1380
IGANAAVGDI IINAGSVNFS NTLKSGN"IVI SDGATMQVNN MVTATDISGK NAiJNGTLKIiN 1440
NHTPINITST LGNNNAIGTI EVANNDVTIT GTLQAQNIHF SNATQAATIiT LGAASQVTNJ 1500
TTAGNNIHTL EVTDFDTGND GIIGDANISTRL KSIELTGNGT VTINSPHVYS SITTANNAQG 1560
NVKLNIEGGI TYDLGSKIKS LANVQISEDT TIRGDVYSKY LNIDAGKTIN FDRGDNNMNP 1620
KNLDIPDALI DLDVLPRSLS LFNYFTDIKA DNLNFADDTA TANFKDAWI DAHIDNGGIL 1680
KFNDNAWLTQ EIKNANIIEI ASDKFMLLQK NIKAATLIAD NANLVLLDNV EVNTNLNVRD 1740
IVLDLANYEL KYTGNVTHMG LBTIITYFDT ALQKGGHIIiV SQGSNVDMSD LDNLIIKIKA 1800
HSDITNITSD TKHQIVKLET GAIYTPVPQT KVIIDASEEQ NKFVKWVADA NGLVLLTDTG 1860
GRDDTGGRDD TRGRGNTDNG CRDNCDVGNI SNNSSNEAGG SSSDKJSTYGIT DWPIFDPSP 1920
ILDYTKNNYV ASGIANQLIN HVKDFGN"TTD AGKLIiNDLGF MSPNRVTETIi DRLSNRINVN 1980
GLNEGWGLN GIEVENFLTD IAINMDN"FTA KEIGNRLEEL SDANTVNGLN" KTNTLLNNKI 2040
NLKRLNTNNQ AIIAAGDEDN IVTGIWGMSF YGKIKQNSKN SASGYQSNTG GGIIGFDYNI 2100
DNSIVIGAAY TMADSKVKHK NDKNGDRTKA KSNIYSIYGL YDJWLTNNFFV EAIGVYGRNK 2160
IKNYEKRITT ITDQIAIGKF INTFYSYELL GGYNYLISHR TTITPMFGMR YATFKNWGYK 2220
ENNTTFQMLS IKKNYYDKFE TILGLNSVTH YLSQDIIIKP EliHWFINYQC KNKLPNIDAR 2280
LDGIDEPLTT IRFKPAKITY NLGG6ISTKN NMIEFGIRYN LSLAKKYTAH QGSLiKIKVNL 2340



<212> Type : PRT

<211> Length : 2340

SequenceName : SEQ ID 171
SequenceDescription :

Sequence

<213> OrganismName : Rickettsia prowazekii
<400> PreSequenceString :

MAQKPNFLKK IISAGLVTAS TATIVAGFSG VAMGAAMQYN RTTNAAATTF DGIGFDQAAG 60 

40 ANIPVAPNSV rXANAmrPIT FNTPUGHUSTS IiFLDTANDLA VTIITEDTTLG FITNIAQQAK 120 

FFNFTVAAGK ILNITGQGIT VQEASNTINA QNALTKVHGG AAINANDLSG LGSITFAAAP 180 

SVLEFNLINP TTQEAPLTLG ANSKIWGGN GTLNITNGFI QVSDNTFAGI KTINIDDCQG 240 

LMFNSTPDAA NTLNLQVGGN TINFNGXDGT GKLVLVSKNG AATEFNVTGT LGGNLKGIIE 3 00 

LNTAAVAGKL ISQGGAANAV IGTDNGAGRA AGFIVSVDNG NAATISGQVY AKNMVIQSAN 3 60 

45 AGGQVTFEHI VDVGLGGTTN FKTADSKVII TENSNFGSTN FGNLDTQIW PDTKILKGNF 420 

IGDVKNNGNT AGVITFNANG ALVSASTDPN lAVTNINAIE AEGAGWELS GIHIAELRLG 48 0 

NGGSIFKLAD GTVINGPVNQ NALMNlSnSTAIiA AGSIQLDGSA IITGDIGNGG VNAALQHITL 540 

ANDASKIIiAL DGANIIGANV GGAIHFQANG GTIKLTNTQN NIWNFDLDI TTDKTGWDA 600 

SSLTNNQTLT INGSIGTWA NTKTIiAQLNI GSSKTILNAG DVAINELVIE NNGSVQLNHN" 660 

50 TYLITKTINA ANQGQIIVAA DPLNTNTTLA DGTNLGSAEN PLSTIHFATK AANADSILNV 72 0 

GKGVNLYANN ITTNDANVGS LKFRSGGTSI VSGTVGGQQG HKLlsTtsTLILDN GTTVKFLGDT 780 

TFNGGTKIEG KSILQISNNY TTDHVESADN TGTLEFVNTD PITVTLNKQG AYFGVLKQVI 840 

ISGPGNIVFN EIGNVGIVHG lAANSISFEN ASLGTSLFLP SGTPLDVLTI KSTVGNGTVD 900 

NFNAPIWVS GIDSMINNGQ IIGDKIOSrilA LSLGSDNSIT VNANTLYSGI RTTKNNQGTV 960 

55 TLSGGMPNNP GTIYGLGLEN GSPKLKQVTF TTDYNNLGSI lAlIlSrVTINDY VTLTTGGIAG 1020 

TDFDAKITLG SVNGNANVRF VDSTPSDPRS MIVATQANKG TVTYLGNALV SNIGSLDTPV 1080 

ASVRFTGNDS GAGLQGNIYS QNIDFGTYNL TILNSNVILG GGTTAINGEI DLLTNNLIFA 1140 

NGTSTWGDNT SISTTLNVSS GNIGQWIAE DAQVNATTTG TTTIKIQDNA NANFSGTQAY 1200 

TLIQGGARFN GTLGAPNFAV TGSNIFVKYE LIRDSNQDYV LTRTNDVLNV VTTAVGNSAI 1260 

60 ANAPGVSQNI SRCIiESTNTA AYNNMLLAKD PSDVATFVGA lATDTSAAVT TVNLNDTQKT 1320 

QDLLSNRLGT LRYLSNAETS DVAGSATGAV SSGDEAEVSY GVWAKPFYNI AEQDKKGGIA 13 8 0 

GYKAKTTGW VGLDTLASDN LMIGAAIGIT KTDIKHQDYK KGDKTDINGL SFSLYGSQQL 1440 

VKNFFAQGNA IFTLNKVKSK SQRYFFESNG KMSKQIAAGN YDNMTFGGNL IFGYDYNAMP 15 0 0 

NVLVTPMAGL SYLKSSNENY KETGTTVANK RIKTSKFSDRV DLIVGAKVAG STVNITDIVI 1560 

65 YPEIHSFWH KVNGKLSNSQ SMLDGQTAPF ISQPDRTAKT SYlSTIGLSANI KSDAKMEYGI 162 0 

GYDFNSASKY TAHQGTLKVR VNF 1643 
<212> Type : PRT 
<211> Length : 1643

SequenceName : SEQ ID 172
SequenceDescription :

Sequence

<213> OrganismName : Porphyromonas gingivalis W83
<400> PreSequenceString : 

M2UlIIIiEAHD VWEDGTGYQM LWDADHNQYG ASIPEESPWF ANGTIPAGLY DPFEYKVPVN 60 

10 ADASFSPTNF VLDGTASADI PAGTYDYVII NPNPGIIYIV GEGVSKGNDY WEAGKTYHF 120 

TVQRQGPGDA ASWVTGEGG NEFAPVQNLQ WSVSGQTVTL TWQAPASDKR TYVLNESFDT 180 

QTLPNGWTMI DADGD6HNWL STINVYNTAT HTGDGAMFSK SWTASSGAKI DLSPDNYLVT 240 

PKFTVPENGK LSYWVSSQEP WTNEHYGVFL STTGNEAANF TIKLLEETIiG SGKPAPMNLV 3 00 

KSEGVKAPAP-.YQERTIDLSA YAGQQVYLAF RHFGCTGIFR LYLDDVAVSG EGSSNDYTYT 360 

15 VYRDNWIAQ NLTATTFNQE NVAPGQYNYC VEVKYTAGVS PKVCKDVTVE GSNEFAPVQN 420 

IiTGSAVGQKV TLKWDAPNGT PNPNPGTTTL SESFENGIPA SWKTIDADGD GNNWTTTPPP 48 0 

GGSSFAGHMS AICVSSASYI NFEGPQNPDN YLVTPELSLP NGGTLTFWVC AQDANYASEH 540 

YAVYASSTGN DASNFANALL BEVLTAKTW TAPEAIRGTR VQGTWYQKTV QLPAGTKYVA 60 0 

FRHFGCTDFF WINLDDVEIK ANGKRADFTE TPESSTHGEA PAEWTTIDAD GDGQGWLCLS 660 

20 SGQLGWLTAH GGTNWASFS WNGMALNPDN YLISKDVTGA TKVKYYYAVN DGFPGDHYAV 72 0 

MISKTGTNAG DFTWFEETP NGINKGGARF GLSTEANGAK PQSVWIERTV DLPAGTKYVA 780 

FRHYNCSDLN YILLDDIQFT MGGSPTPTDY TYTVYRDGTK IKEGLTETTF EEDGVATGNH 84 0 

EYCVEVKYTA GVSPKECVNV TVDPVQFNPV QNIiTGSAVGQ BCVTLKWDAPN GTPNPNPGTT 90 0 

TLSESFENGI PASWKTIDAD GDGNNWTTTP PPGGTSFAGH NSAICVSSAS YINFEGPQNP 960 

25 DNYLVTPELS LPNGGTLTFW VCAQDANYAS EHYAVYASST GNDASNFANA LLEEVLTAKT 102 0 

WTAPEAIRG TRVQGTWYQK TVQLPAGTKY VAFRHFGCTD FFWINLDDVE IKANGKRADF 108 0 

TETFESSTHG EAPAEWTTID ADGDGQGWLC LSSGQLDWLT AHGGTNWAS FSWNGMALNP 1140 

DNYIiISKDVT GATKVKYYYA VNDGFPGDHY AVMISKTGTN AGDFTWFEE TPNGINKGGA 120 0 

RFGIiSTEANG AKPQSVWIER TVDLPAGTKY VAFRHYNCSD LNYILLDDIQ FTMGGSPTPT 1260 

30 DYTYTVYRDG TKIKEGLTET TFEEDGVATG NHEYCVEVKY TAGVSPKECV NVTVDPVQFN 132 0 

PVQNLTGSAV GQKVTLKWDA PKGTPNPNPG TTTLSESFEKT GIPASWKTID ADGDGNNWTT 13 8 0 

TPPPGGTSFA GHNSAICVSS ASYINFEGPQ NPDNYLVTPE LSLPNGGTLT FWCAQDANY 1440 

ASEHYAVYAS STGKDASNFA. NAIiriEEVLTA KTWTAPEAI RGTRVQGTWY QKTVQLPAGT 150 0 

KYVAPRHFGC TDPFWINLDD VEIKAKTGKRA DPTETFESST HGEAPAEWTT IDADGDGQGW 1560 

35 LCLSSGQLGW LTAHGGTNW ASFSWBTGMAIi NPDNYLISKD VTGATKVKYY YAVNDGPPGD 1620 

HYAVMISKTG TNAGDFTWF EETPNGINKG GARFGLSTEA NGl^JCPQSVWI ERTVDLPAGT 1680 

KYVAFRHYNC SDLNYILLDD IQFTMGGSPT PTDYTYTVYR DGTKIKEGLT ETTFEEDGVA 1740 

TOKIHEYCVBV KYTAGVSPKE CVNVTINPTQ PNPVQNLTAE QAPNSMDAin KWNAPASKRA 18 0 0 

EVLNEDFENG IPASWKTIDA DGDGNNWTTT PPPGGSSFAG HKTSAICVSSA SYINFEGPQN 1860 

40 PDNYLVTPBL SLPGGGTLTP WVCAQDANYA SEHYAVYASS TGlsTDASNFAN ALLEEVLTAK 192 0 

TWTAPEAIR GTRVQGTWYQ KTVQIiPAGTK YVAFRHFGCT DFFWINLDDV VITSGNAPSY 1980 

TYTIYRNNTQ lASGVTETTY RDPDLATGPY TYGVKWYPN GESAIETATL NITSLADVTA 2 040 

QKPYTLTWG KTITVTCQGE AMIYDMNGRR IiAAGRlSTTWY TAQGGHYAVM WVDGKSYVE 210 0 

KLAVK 2105 

45 <212> Type : PRT 

<211> Length : 2105 

SequenceName : SEQ ID 173
SequenceDescription :

Sequence

<213> OrganismName : Porphyromonas gingivalis W83
<400> PreSequenceString : 

MKTSERILSY FFLLCAVPSIj GSCEGLYAQV TFPNYSPTAA SSIAVCSGEE TLIIDFTWQ 60 

55 EDSNGIKVNV KLADGVEYW GTAWSVTQG NAVTVAETNV SNPNEPVFTV KSADGNNWE 120 

LGTIVKLTIK RRAVCTAWSN AINAAETGPV PKDKVTVTIG DHSDSKESNS YSVNYPNLTI 180 

KQPAPQVNKQ IGBTIVREFS ITNGSQNPTQ TVYLSIBYPD EAYLTGVGAM TLQAKLGASG 240 

TYADLTPTVT NGKVRIYTLS GSSLGPDHLL TNGEIIYLKE TFKLKTCAPV TVYRVGWGCS 3 00 

IDSQCEIKTT AATITMAAGA ANITGYSVTG PDYRSPTFSL CQPFELTIKF SNSGAGGSMG 360 

60 AAFNINTIGR NDYYRPRGPV LHEFIDVKVN GKPVTNPKTD GSELDLRPDG QFTEDPDGPG 420 

VGLDDVDGDG PYDDLPVGAT ITITVTVRLK CDQFTACaaNA PNDLSDRGLI LKTLYQTSCD 480 

RTSWIDPNTW FNLSSTHLYL SRESVQDASH MPTVIEKDTP FDLKIMTSYY SILSSYNNIW 540 

YANPNTRYW EIVFPQGMTM PPKSDIEWTN IKNHPIDGSL VFTPPINLPD ANITTSGNTM 600 

TIVSPSQERG FVTLHGVKYD CTNUJHEMWE YKIREVFNYL HFPDCLCPVG PIMCNTAKRY 660 

63 VLGCDPPCGR GAETSVPKIE RADNSLGWTD YTMRTRQSRS NISAYDLAKA LYMDEVNITA 720 

TSIQHGTASS LGARFVLATG VDRVETLTPL SADIKIFRDG VQIVSVDGYT TFRSIRRNISIN 780 

AEQVIDWDFT SILPAGGLLD RDKVDWTRY RVTSQNAHRV DTQVGREWFP YNSTANVSPI 840 






WDEANPLTCL ILVPEIYIMG TFWNGTDPH VISQCTPTDL GRVANHYARR FGSGAFEYAN 90 0 

EYRPGVKIRKT lYLKVPKSYT LNRVEYSNHR NHSSLGTTMP FEBINHTDVT SQGEYNIYKY 960 

QLADNEKAHF KTITVKNAYGA ALKVNVSPTC ASSAVATKTYD KISYYVDY^D YYYYAATQPT 1020 

VPNSLDIVAD QSAGSN6IYS VSALKTVYNRP ILYTNKPSXA LVNQSGEVEIi VGKTGEWKXiR 10 8 0 

5 ISNPSSATAP YVWIiALPTTS GLTIEKVTDA AGTEMAFTTY SGGKMYRLSE AGVPVGSALD 1140 

YTIHFTYSGC SPIALKAMGG WNCSAYPLSL DEYVCSSQVI DLKLKPLPAA MEIiTEIAVPD 120 0 

PTAAATLCST LEYIYSIQST DNANVYSPTP SIPPEEGIiW TPNQVQVEYP AGSGNWAALN 12 60 

WNNSVNLLQ HPALTTIGYL KGLKEGESND NQRKILVKFY IKTECSFVSG KNFRVRADGR 132 0 

NACNQNAKGS GIiAISTPPIR INGAIEPYTT SASTQLVTTT TSQSDCKAPK RVKWQTWG 13 80 

10 GETTPKAYLE ITLPLGFKYV TGSYAPDNTH PGGVNASPAG TEEVTLTANG EDKIKINVKA 1440 

GIiTSGQSFAY TLEMKEDDDN VPACGNHTIE IVNVEEIEGL WCEGVQCAET LWTGANKFE 1500 

FELDKPYLDI TVISAVSTFS GGKENLTIEY KVSNTSTTQP LKPGAWTLF SDKDNNQVFS 15 60 

GGDVAVATQE LVAEITNTTP BTQIMKVKGV SSSHTGNIiVIi TILPKDGCYC EIKSPMVTLN 1620 

HLPSIJYWIGG TVGKPNEWKE PNNWTNDQVP DAAEDVEFAT EVNNPTDPmi PKSGPAKENL 1680

15 HLDDIHQNGT AGRVIGKTLIKT DSDKDLVITT GNQLTINGW EDNNPNVGTI WKSSKDNPT 1740 

GTLIiFANPGKT NQ3S3VGGTVEF YNQGYDCADC GMYRRSWQlfF GIPVNESDFP YDHVDGNATV 18 00 

NQWVEPFNGD KWRPAPYAPD TKLQKPKGYQ ITNDVQAQE»T GVYSFKGTLC VCDAFLNLTR 1860 

TSGVNYSGAN LIGNSYTGAI DIKQGIVFPP EVEQTVYTiFN TGTRDQWRKL NGSTVSGYRA 1920 

GQYL&VPKNT AGQDNLPDRI PSMHSFLVKM QNGASCTLQI LYDKLIiKNTT WNGNGTQIT 1980 

20 WRSGNSGSAN MPSLVMDVIiG NESADRLWIP TDGGLSFGFD NGWDGRKLTE KGLSQLYAMS 2040 

DIGNDKFQVA GVPELNNLLI GFDADKDGQY TLEPALSDHF AKGGVFLEDL SRGVTRRWD 2100 

GGSYSFDAKR GDSGARFRLS YDEEWVESAE VSVLVGTAGK RIVITNNSEH ACQANVYTTD 2160 

GKLLIRLDVK PGSKSMTEPL VDGVYWSLQ SPATSSNVRK WVN 2204 
<212> Type : PRT 

<211> Length : 2204

SequenceName : SEQ ID 174 
SequenceDescription : 






Sequence 

<213> OrganismName : Porphyromonas gingivalis W83
<400> PreSequenceString :

MBIKFYBCSIiLQ SGLAAFVSMA TADTASAQIS FGGEPLSF'SS RSAGTHSFDD AMTIRLTPDF 60
NPEDIiIAQSR WQSQRDGRPV RXGQVIPVDV DFASKASH^IS SIGDVDVYRL QFKLEGAKAI 120
TLYYDAPNIP BGGRLYrYTP DHBIVLGAYT NATHRRNGAF ATEPVPGSEIi IMDYEVSRGG 180
TLPDIKISGA GYIFDKVGGR PVTDNHY6IG EDDSDSDCEX NINCPEGADW QAEKNGWQM 240
IMVKGQYXSM CSGWLLNNTK GDFTPIiIISA GHCASXTTNF GVTQSELDKW IFTFHYEKRG 300
CSNGTIiATFR GKTSIXGASMK AFLPIKGECSD GLLLQLNOEV PLRYRVYYNG WDSTPDIPSS 360
GAGIHHPAGD AMKXSIIiKKT PALNTWISSS GSGGTDDHFY FKYDQGGTEG GSSGSSLFNQ 420
KKHWGTrtTG GAGNCGGTEF YGRBNSHWNE YASDGNTSKM DIYLDPQNNG QTTILNGTYR 480
DGYKPLPSVP RIiLLQSTGDQ VELNWTAVPA DQYPSSYQVE YHIFRNGKEX ATTKELSYSD 540
AXDESIXGSG IIRYEVSARF lYPSPLDGVB SYKDTDKXSA DLAIGDIQTK LKPDVTPLPG 600
GGVSLSWKVP FLSQLVSRFG ESPNPVFKTF EVPYVSAAAA QTPNPPVGW lADKFMAGTY 660
PEKAAIAAVY VMPSAPDSTF HLFLKSNTNR RLQKVTTE>SD WQAGTWLRIN LDKPFPVlSrND 720
HMLFAGIRMP NKYKLNRAIR YVRNPDNLFS ITGKKISITNN GVSFEGYGIP SIiLGYMAIKY 780
LWNTDAPKX DMSLVQEPYA KGTNVAPFPE LVGIYVYKNG TFIGTQDPSV TTYSVSDGTE 840
SDEYBIKIiVY KGSGISNGVA QXENNNAWA YPSWTDRPS XKNAHMVHAA AIiYSLDGKQV 900
RSWNNLRNGV TFSVQGLTAG TYMLVMQTAN GPVSQKIVKQ 940
<212> Type : PRT
<211> Length : 940

SequenceName : SEQ ID 175

SequenceDescription :

Sequence

<213> OrganismName : Porphyromonas gingivalis W83
<400> PreSequenceString :

MKNLNKFVSI ALCSSLLGGM AFAQQTELGR NPNVRLIiEST QQSVTKVQFR MDNLKFTEVQ 60
TPKGMAQVPT YTEGVNLSEK GMPTLPXLSR SLAVSDTREM KVEWSSKFI EKKNVLIAPS 120
KGMIMRNEDP KKXPYVYGKS YSQNKFFPGE lATLDDPFIL RDVRGQWNF APLQYNPVTK 180
TLRIYTEITV AVSETSEQGK NILNKKGTFA GFEDTYKRMF MNYEPGRYTP VEEKQNGRMI 240
VIVAKKYEGD IKDFVDWKNQ RGLRTEVKVA EDIASPVTAN AIQQPVKQEY EKEGNDLTYV 300
LLIGDHKDIP AKITP6IKSD QVYGQIVGND HYNEVFIGRF SCESKEDLKT QIDRTIHYER 360
NITTBDKWLG QALCIASAEG GPSADNGESD XQHENVXAISIL LTQYGYTKII KCYDPGVTPK 420
KIXDAFNGGI SLANYTGHGS ETAWGTSHFG TTHVKQIiTNS NQLPFIFDVA CVNGDFLFSM 480
PCFABALMRA QKDGKPTGTV AIIASTINQS WASPMRGQDE MNEXLCEKHP NNIKRTFGGV 540
TMNGMFAMVE KYKKDGEKML DTWTVPGDPS LLVRTLVPTK MQVTAPAQXN LTDASVIJVSC 600
DYNGAIATIS ANGKMFGSAV VENGTATINL TGLTNESTLT LTWGYNKET VIKTINTNGE 660
PNPYQPVSNL TATTQGQKVT LKWDAPSTKT NATTNTARSV DGIREtiVLLS VSDAPELLRS 720 
GQAEIVLEAH DVWNDGSGYQ ILIiDADHDQY GQVIPSDTHT LWE^NCSVPAN LFAPFEYTVP 780 
ENADPSCSPT NMIMDGTASV NIPAGTYDFA lAAPQANAKI WIAGQGPTKE DDYVFEAGKK 840 
5 YHFLMKKMGS GDGTELTISE GGGSDYTYTV YRDGTKIKEG LTATTFEEDG VAAGNHEYCV 900 
EVKYTAGVSP K\7CKDVTVEG SNEFAPVQNL TGSAVGQKVT LKWDAPNGTP NPNPNPNPNP 960 

NPGTTTLSES FENGIPASWK TIDADGDGHG WKPGNAPGIA GYNSNGCVYS ESFGLGGIGV 102 0 

LTPDNYLITP ALDLPNGGKL TFWCAQDAN YASEHYAVYA SSTGNDASNF TNALLEETIT 1080 

AKGVRSPEAI RGRIQGTWRQ KTVDLPAGTK YVAFRHFQST DMFYIDLDEV EIKANGKRAD 1140 

10 FTETFESSTH GEAPAEWTTI DADGDGQGWL CLSSGQLDWL TAHGGTNWS SFSWNGMALN 12 00 

PDNYLISKDV TGATKVKYYY AWDGFPGDH YAVMISKTGT NAGDFTWFE ETPNGINKGG 12 60 

ARFGLSTEAD GAKPQSVWIE RTVDLPAGTK YVAFRHYNCS DLNYILLDDI QFTMGGSPTP 13 20 

TDYTYTVYRD GTKIKEGLTE TTFEEDGVAT GNHEYCVEVK YTAGVSPKKC VNVTVNSTQF 13 80 

NPVKNLKA.QP DGGDV\n:,KI'?S ^JE-SAKKTEGS REVKRIGDGL FVTIEPAHDV RA.HKAKYVI1A 1440 

15 ADNVWGDNTG YQFLLDADHN TFGSVIPATG PLFTGTASSD LYSANFEYLI PANADPWTT 1500 

QNIIVTGQGE WIPGGVYDY CITMPEPASG KMWIAGDGGM QPARYDDFTF EAGKKYTFTM 1560 

RRAGMGDGTD MEVEDDSPAS YTYTVYRDGT KIKEGLTETT YRDAGMSAQS HEYCVEVKYT 1620 

AGVSPKVCVD YIPDGVADVT AQKPYTLTW 6KTITVTCQG EAMIYDMNGR RLAAGRWTW 1680 

YTAQGGYYAV MWVDGKSYV EKLAIK 1706 

20 <212> Type : PRT 

<211> Length : 1706 

SequenceName : SEQ ID 176
SequenceDescription :

Sequence

<213> OrganismName : Porphyromonas gingivalis W83

<400> PreSequenceString : 

MKRKPLFSAL VILSGFFGSV HPASAQKVPA PVDGERIIME LSEADVECTI KIEAEDGYAN 60 

DIWADLNGMG KYDSGERLDS GEFRDVBFRQ TKAIVYGKMA KFLFRGSSAG DYGATFIDIS 120 

NCTGLTAFDC FANLLTELDL SKANGLTFVN CGKNQLTKLD LPANADIETL NCSKNKITSL 180 

NLSTYTKLKE LYVGDNGLTA LDLSANTLLE ELVYSNNEVT TINLSANTNIi KSLYCINNKM 240 

TGLDVAANKE LKILHCNNNQ LTALNLSANT KLTTLSFFNN EliTNIDLSDN TALEWIiFCNG 3 00 

NKLTKLDVSA NANLIALQCS NNQLTALDLS RTPKLTTLNC YSNRIKDTAM RALIESLPTI 360 

TEGEGRFVPY NDDEGGEEEN VCTTEHVEMA KAKNWKVLTS WGEPFPGITA LISXEGESEY 420 

SVYAQDGILY LSGMEQGLPV QVYTVGGSMM YSSVASGSAM EIQLPRGAAY WRIGSHAIK 480 

TAMP 484 
<212> Type : PRT
<211> Length : 484

SequenceName : SEQ ID 177

SequenceDescription :

Sequence

<213> OrganismName : Shigella flexneri 2a str. 2457T

<400> PreSequenceString : 

MKRAITLFAV LLMGWSVNAW SFACKTANGT AIPIGGGSAN VYVNLiAPWN VGQNLWDLS 60 

TQIFCHNDYP ETITDYVTLQ RGSAYGGVLS NFSGTVKYSG SSYPFPTTSE TPRWYNSRT 120 

DKPWPVALYL TPVSSAGGVA IKAGSLIAVL ILRQTNNYNS DDFQFWNIY ANNDVWPTG 180 

GCDVSARDVT VTLPDYPGSV PIPLTVYCAK SQNLGYYBSG TTADAGNSIF TNTASFSPAQ 240 

GVGVQLTRITG TIIPANNTVS LGAVGTSAVS LGLTANYART GGQVTAGNVQ SIIGVTFVYQ 3 00 

<212> Type : PRT 
<211> Length : 300 
SequenceName : SEQ ID 178

SequenceDescription :

Sequence 



<213> OrganismName : Shigella flexneri 2a str. 2457T 
<400> PreSequenceString : 

MGIKQHNGNT KADRLAELKI RSPSIQLIKF GAIGLNAIIF SPLIilAADTG SQYGTNITIN 60 

DGDRITGDTA DPSGNLYGVM TPAGNTPGNI NLGNDVTVNV NDASGYAKGI IIQGKNSSLT 120 

ANRLTVDWG QTSAIGINLI GDYTHADLGT GSTIKSNDDG IIIGHSSTLT ATQFTIENSN 180 

GIGLTINDYG TSVDLGSGSK IKTDGSTGVY IGGLNGNNAN GAARFTATDL TIDVQGYSAM 240 

GINVQKNSW DLGTNSTIKT NGDNAHGLWS FGQVSANALT VDVTGAAANG VEVRGGTTTI 300 

GADSHISSAQ GGGLVTSSSD ATIKTFSGTAA QRNSIPSGGS YGASAQTATA VINMQNTDIT 360 






VDRNGSLALG LWALSGGRIT GDSIiAITGAA GARGIYAMTKf SQIDLTSDLV IDMSTPDQMA 42 0 

lATQHDDGYA ASRINASGRM LINGSVLSKG GLINIiDMHPG SVWTGSSLSD NVNGGKLDVA 480 

MNNSVWNVTS NSNLDTIiALS HSTVDFASHG STAGTFTTLN VENLSGNSTF IMRADWGEG 540 

NGVNNRGDLL NISGSSAGNH VLAIRNQGSE ATTGNEVLTV VKTTDGAASF SASSQVELGG 60 0 

YLYDVRKNGT NWELYAS6TV PEPTPNPEPT PAPAQPPIVN PDPTPEPAPT PKPTTTADAG 660

GNYLNVGYLL NYVENRTLMQ RMGDLRKTQSK DGNIWLRSYG GSLDSFASGK LSGFDMGYSG 72 0 

IQFGGDKRLS DVMPLYVGLY IDSTHASPDY SGGDGTARSD YMGMYASYMA QNGFYSDLVI 78 0 

KASRQKNSFH VLDSQNNGVN ANGTANGMSI SLEAGQRFNL SPTGYGFYIE PQTQLTYSHQ 840 

NEMAMKASNG MIHUSIHYES LLGRASMILG YDITAGNSQL NVYVKTGAIR EFSGDTEYLL 900 

NDSREKYSFK 6NGWNNGVGV SAQYKTKQHTF YIiEADYTQGN LFDQKQVNGG YRFSF 955

<212> Type : PRT 
<211> Length : 955 

SequenceName : SEQ ID 179
SequenceDescription :

Sequence

<213> OrganismName : Shigella flexneri 2a str. 2457T

<400> PreSequenceString :

MSKFVKTAIA AAMVMGVPTS TATIAAGNNG TARFYGTIED SVCSIVPDDH KLEVDMGDIG 60 
AEKLKNNGTT TPKSFQIRLQ DCVFDTQETM TTTFTGTVSS ANSGNYYTIF NTDTGAAFNN 120 
VSLAIGDSLG TSYKSGMGID QKIVKDTSTN KGKAKQTLNF NAWLVGAADA PDIjGNFEAWT 180 
TFQITYL 187 

<212> Type : PRT

<211> Length : 187

SequenceName : SEQ ID 180
SequenceDescription :

Sequence



<213> OrganismName : Shigella flexneri 2a str. 2457T
<400> PreSequenceString : 

t^lKTLAIW LSALSLSSAA ALADTTXVNG GTIHFKGEW NAACAVDAGS VDQTVQLGQV 60 
RTASLE3QAGA TSSAVGFNIQ LNDCDTXVAT KAAVAFLGTA IDATRTDVLA LQSSAAGSAT 120

NVGVQILDRT GNALTLDGAT FSAQTTXjNNG TNTIPFQARY YAIGEATPGA ANADATFKVQ 180 

YQ 182 

<212> Type : PRT

<211> Length : 182
SequenceName : SEQ ID 181

SequenceDescription :

Sequence



<213> OrganismName : Shigella flexneri 2a str. 2457T
<400> PreSequenceString : 

MASISSLGVG SGLDLSSILD SLTAAQKATL TPISNQQSSF TAKLSAYGTL KSALTTFQTA 60 
NTALSKADLF SATSTTSSTT AFSATTAGNA lAGKYTISVT HLAQAQTLTT RTTRDDTKTA 120 
lATSDSKLTI QQGDDKDPIT IDISAANSSL SGIRDAINNA KAGVSASIIN VGNGEYRLSV 180 

TSNDTGLDNA MTLSVSGDDA LQSFMGYDAS ASSNGMEVSV AAQNAQLTVN NVAIENSSNT 240
ISDALENITL NLNDVTTGNQ TLTITQDTSK VQTAIKDWVN AYNSLIDTFS SLTKYTAVDA 300 
GADSQSSSNG ALLGDSTLRT IQTQLKSMLS NTVSSSSYKT LAQIGITTDP SDGKLELDAD 360 
KLTAALKKDA SGVGALIVGD GKKTGITTTI GSNLTSWLST TGIIKAATDG VSKTLNKLTK 420 
DYNAASDRID AQVARYKEQF TQLDVLMTSL NSTSSYLTQQ FENNSNSK 468 

<212> Type : PRT

<211> Length : 468

SequenceName : SEQ ID 182
SequenceDescription :

Sequence



<213> OrganismName : Shigella flexneri 2a str. 2457T 
<400> PreSequenceString : 

MBGKADNWL ENGGRLDVLT GHTATNTRVD DGGTLDVRNG GTATTVSMGN GGVLLADSGA 60 

AVSGTRSDGK APSIGGGQAD ALMLEKGSSF TLNAGDTATD TTVNGGLFTA RGGTLAGTTT 120

LNNGAILTLS GKTVNNDTLT IRBGDALLQG GALTGNGSVE KSGSGTLTVS NTTLTQKAVN 180 

LNEGTLTIifcID STVTTDVIAQ RGTALKLTGS TVLNGAIDPT NVTLASGATW NIPDNATVQS 240 






WDDLSHAGQ IHFTSTRTGK FVPATLKVKN LNGQNGTISL RVRPDMAQNN ADRLVIDGGR 3 00 

ATGKTILNLV NAGNSASGLA TSGKGIQWB AINGATTEEG APIQGNKLQA GAFNYSLNRD 3 60 

SDESWYIiRSE NAYRAEVPLY ASMLTQAMDY DRILAGSRSH QTGVSGENNS VRLSIQGGHI. 420 

GHDNNGGIAR GATPESSGSY GFVRLEGDIjIi RTEVAGMSVT AGVYGAAGHS SVDVKDDDGS 480 

RAGTVRDDAG SLGGYLNLIH NASGLWADIV AQGTRHSMKA SSDNNDFRVR 6WGWLGSLET 540

GI.PFSITDNL MLEPQLQYTW QGLSLDDGQD NASYVKFGHG SAQHVRAGFR LGSHHDMNFG 600 

KGTSSRDTLR GSAKHSVREL PVNWWQPSV IRTFSSRGDM SMGTAAAGSN MTFSPSQNGT 660 

SLDLQAGLBA RVREKTITLGV QASYAHSIKTG SSABGYNSQA TLNVTF 706 
<212> Type : PRT 
<211> Length : 706

SequenceName : SEQ ID 183

SequenceDescription :



Sequence


<213> OrganismName : Shigella flexneri 2a str. 2457T
<400> PreSequenceString : 

MAFSQAVS6L MTAAATNLDVI GNNIAbTSATlir GFKSGTASFA DMFAGSKVGL GVKVAGITQD 60 
PTDGTTTNTG RGLDVAISQN GEFRLVDSlsTG SVFYSRNGQF KLDENRNLLN TQGLQLTGYP 12 0 

VTGTPPTIQQ GAKfPTNISIP NTLMAAKTTT TASMQINBNS SDPLPTVTPF SASNADSYNK 180
KGSVTVFDSQ GNAHDMSVYF VKTGDNNWQV YTQDSSDPNS lAKTATTLEF NANGTLVDGA 240 
MANNIATGAI NGAEPATFSL SFLNSMQQN-T GANNIVATTQ NGYKPGDLiVS YQINDDGTW 3 00 

GNNSNEQTQL LGQIVLANFA NNEGLASEGD NVWSATQSSG VALLGTAGTG NFGTIiTNGAIi 360 
EASNVDLSKE LWMIVAQRN YKSNAQTIKT QDQILNTRVN LR 402 

<212> Type : PRT

<211> Length : 402

SequenceName : SEQ ID 184
SequenceDescription :

Sequence



<213> OrganismName : Shigella flexneri 2a str. 2457T
<400> PreSequenceString : 

MKLVHMASGL AVAIALAACA DKSADXQTPA PAANTSISAT QQPAXQQPNV- SGTVWIRQKV 60 
ALPPDAVLTV TLSDASLADA PSKVLAQKZW RTEGKQSPFS FVLPFNPADV QPKTARILLSA 120

AITVNDKLVF ITDTVQPVIN QGGTKADLTIi VPVQQTAVPV QASGGATTTV PSTSPTQVNP 180 

SSAVPAPTQY 190

<212> Type : PRT 

<211> Length : 190
SequenceName : SEQ ID 185

SequenceDescription :

Sequence



<213> OrganismName : Shigella flexneri 2a str. 2457T
<400> PreSequenceString : 

MIIKKSGGRW QLSLLASWI SAFFLNTAYA WQQEYIVDTQ PGHSTERYTW DSDHQPDYND 60 

ILSQRIQSSQ RALGLEVNLA EETPVDVTSS MSMGWNFPLY EQVTTGPVAA LHYDGTTTSM 120 

YNEFGDSTTT LTDPLWHASV SSLGWRVDSR LGDLRPWAQI SYNQQFGENI WKAQS6LSRM 180 

TATNQNGNWL DVTVGADMLL NQNIAAYAAL TQAENTTNNS DYLYTMGVSA RF 232



<212> Type : PRT 
<211> Length : 232 

SequenceName : SEQ ID 186 
SequenceDescription :

Sequence



<213> OrganismName : Shigella flexneri 2a str. 2457T
<400> PreSequenceString :

MKWCKRGYVL AAMLALASAT IQAADVTITV NGKWAKPCT VSTTNATVDL GDLYSFSLMS 60 
AGAASAWHDV ALELTNCPVG TSRVTASFSG AADSTGYYKN QGTAQNIQLE LQDDSGNTLN 120 
TGATKTVQVD DSSQSAHFPL QVRALTVNGG ATQGTIQAVI SITYTYS 167 
<212> Type : PRT 
<211> Length : 167

SequenceName : SEQ ID 187
SequenceDescription :
Sequence



<213> OrganismName : Shigella flexneri 2a str. 2457T
<400> PreSequenceString :

MKRAPLITGL LLISTSCAYA SSGGCGADST SGATNYSSW DDVTVNQTDN VTGREFTSAT 60 
LSSTNWQYAC SCSAGKAVKL VYMVSPVLTT TGHQTGYYKIi NDSLDIKTTL QAlJDIPGIiTT 120 
DQWSVNTRF TQIKSSTVYS AATQTGVCQG DTSRYGPVNI GAWTTFO^YV TKPFLGSMTI 180 
PKTDIAVIK6 AWVDGMGSPS TGDFHDLVKL SIQGNLTAPQ SCKINQGDVI KVNFGFINGQ 240 

KFTTRNAMPD GFTPVDFDIT YDCGDTSKIK NSLQMRIDGT TGWDQYNLV ARRRSSDNVP 300
DVGIRIBNLG GGVANIPFQN" GILPVDPSGH 6TVNMRAWPV KIiVGGELETG KFQGTATITV 360 
MVR 363 
<212> Type : PRT 
<211> Length : 363 

SequenceName : SEQ ID 188

SequenceDescription :

Sequence



<213> OrganismName : Shigella flexneri 2a str. 2457T
<400> PreSequenceString : 

MQKKTAAHTYA ISSLLVLSIiT GCAWIPSTPIi VQGATSAQPV PGPTPVAITGS IFQSAQPINY 60 

GYQPLFEDRR PRNIGDTLTI VLQENVSASK SSSAKTASRDG lOTTFGFDTVP RYLQGLFGNA 120 

RADVEASGGKT TFNGKGGANA SNTFSGTLTV TVDQVLVNGN LHWGEKQIA INQGTEFIRF 180 

SGWNPRTIS GSKTVPSTQV ADARIEYV6N GYINBAQNMG WLORFFIiNLS PM 232



<212> Type : PRT 
<211> Length : 232 

SequenceName : SEQ ID 189
SequenceDescription :

Sequence



<213> OrganismName : Shigella flexneri 2a str. 2457T

<400> PreSequenceString :

MKRHLNTCYR LVPffNHITGAF WASELARAQ GKRGGVAVAL SLAAVTSLPV LAADIWHPG 60 

ETVNGGTLVN HDNQPVSGTA DGVTVSTGLE LGPDSDENTG GQWXKAGGTG RNTTVTAISrGR 12 0 

QIVQAGGTAS DTVXRDGGGQ SIiNGLAVNTT LDNRGEQWVH GGGKAAGTII IsTQDGYQTIKH 180 

GGLATGTIVN TGAEGGPESE KTVSSGQMVGG TAESTTINKN GRQVIWSSGM ARDTLIYAGG 240 

DQTVHGEAHN TRLEGGNQYV HNGGTATETL IJSTRDGWQVIK EGGTAAHTTI NQKESCR 297



<212> Type : PRT 
<211> Length : 297 

SequenceName : SEQ ID 190
SequenceDescription :

Sequence



<213> OrganismName : Shigella flexneri 2a str. 2457T
<400> PreSequenceString :

MMMKTIKHLL CCAIAASALI STGVHAASWK DALSSAASEL GNQNSTTQEG GWSLASLTML 6 0 

LSSGNQALSA DNMNNAAGIL QYCAKQKLAS VTDAENIKNQ VLEKLGLNSE EQKEDTIJYLD 12 0 

GIQGLLKTKD GQQLNLDNI6 TTPLAEKVKT KACDLVLKQG LNPIS 165 
<212> Type : PRT 
<211> Length : 165

SequenceName : SEQ ID 191
SequenceDescription :

Sequence


<213> OrganismName : Shigella flexneri 2a str. 2457T
<400> PreSequenceString : 

MFKGQKTLAA LAVSLLFTAP VYAADEGSGE IHFKGEVIEA PCEIHQDDID KEVELGQVTT 60 
SHINQSHHSD AVAVDLLLVN CDLENSSNGS GGKISKVAVT FDSSAKTTGA DPILNNTSTG 120 
EATGVGVRLM NKDQSNIVLG TATPDIDLAP TSSEQTLNFF AWMEQIDQAT PVTPGAVTAN 180
ATYVLDYK 188 
<212> Type : PRT 
<211> Length : 188

SequenceName : SEQ ID 192
SequenceDescription :

Sequence



<213> OrganismName : Shigella flexneri 2a str. 2457T
<400> PreSequenceString : 

MSAGSPKFTV RRIAALSLVS LWIAGCSDTS NPPAPVSSVN GNAPANTNSG MLITPPPECMG 60 

TTSTAQQPQI QPVQQPQIQA TQQPQIQPVQ PVAQQPVQME NGRXVYlSrRQY GKflPKGSYSG 120 

STYTVKKGDT LPYIAWITGKT DPRDLAQRNN IQAPYALNVG QTLQV6NASG TPITGGNAIT 180 

QADAAEQ6W IKPAQNSTVA VASQPTITYS ESSGEQSAJSTK MLPNNKPTAT TVTAPVTVPT 240 

ASTTEPTVSS TSTSTPISTW RWPTEGKVIE TFGASEGGNK GIDIAGSKGQ AIIATADGRV 3 00 

VYAGNALRGY GNIiIIXKHlSrD DYLSAYAHND TMLVREQQEV KAGQKIATMG STGTSSTRLH 360

FEIRYKGKSV NPLRYLPQR 3 79 
<212> Type : PRT 
<211> Length : 379 

SequenceName : SEQ ID 193 

SequenceDescription :

Sequence



<213> OrganismName : Shigella flexneri 2a str. 2457T
<400> PreSequenceString : 

MIKPLSALIL LIjVTTAAQAE RIRDLTSVQG VRQNSLIGYG LWGLDGTGD QTTQTPFTTQ 60 
TLNNMLSQLG ITVPTGTNMQ LKNVAAVMVT ASLPPFGRQG QTIDVWSSM GNAKSLRGGT 120 
LLMTPLKGVD SQVYALAQGKT ILVGGAGASA GGSSVQWQL NGGRITNGAV lERELPSQFG 180 
VGNTLNLQLN DEDFSMAQQI ADTINRVRGY GSATALDART IQVRVPSGNS SQVRFLADIQ 240 
NMQVNVTPQD AKWINSRTG SWMNREVTL DSCAVAQSNL SVTVNRQANV SQPDTPFGGG 3 00 

QTWTPQTQI DLRQSGGSLQ SVRSSASLNN WRAUSTALiGA TPMDLMSILQ SMQSAGCLRA 3 60 

KLBII 3 65 

<212> Type : PRT 
<211> Length : 365 

SequenceName : SEQ ID 194 

SequenceDescription : 

Sequence 



<213> OrganismName : Shigella flexneri 2a str. 2457T
<400> PreSequenceString : 

MKRSIXAAAV FSSFPMSAGV FAADVDTGTL TIKGNIAESP CKFEAGGDSV SINMPTVPTT 60 
VFEGKAKYST YDDAVGVTSS MLKISCPKEV AGVKLSIiXTN DKITGNDKAI ASSNDTV6DN 120 
SDVLDVSAPF NIESYKTAEG QYAIPFKAKY LKLTDITSVQS ODVLSSLVMR VAQD 174 

<212> Type : PRT 
<211> Length : 174 

SequenceName : SEQ ID 195 

SequenceDescription : 

Sequence 

<213> OrganismName : Shigella flexneri 2a str. 2457T
<400> PreSequenceString : 

MAVQKNVIKG IIiAGTFALML SGCVTVPDAI KGSSTTPQQD LVRVMSAPQL YVGQEARFGG 60 
KWAVQNQQG KTRLEIATVP LDSGARPTLG EPSRGRI'2'AD VNGFLDPVDF RGQLVTWGP 120 
ITGAVDGKIG NTPYKFMVMQ VTGYKRWHLT QQVIMPPQPI DPWFYGGRGW PYGYGGWGWY 18 0 

NPGPARVQTV VTE 193 
<212> Type : PRT 
<211> Length : 193 

SequenceName : SEQ ID 196 

SequenceDescription : 

Sequence 



<213> OrganismName : Shigella flexneri 2a str. 2457T
<400> PreSequenceString : 

MRNKPPYLLC APLWLAVSRV LAADSTITIR GYVRDNGCSV AAESTNFTVD LMENAAKQFN 60 
NIGATTPWP FRILLSPCGN AVSAVKVGFT GVADSHNANL LALENTVSAA AGLGIQLLWE 120
QQNQIPLNAP SSAISWTTLT PGKPNTLNFY ARLMATQVPV TAGHINATAT FTLEYQ 176 

<212> Type : PRT 
<211> Length : 176 

SequenceName : SEQ ID 197 

SequenceDescription : 

Sequence 

<213> OrganismName : Shigella flexneri 2a str. 2457T
<400> PreSequenceString :

MKKLTVAALA VTTLLSGSAF AHEAGEFFMR AGSATVRPTE GAGGTLGSLG GFSVTNNTQL 60 
GLTFTYMATD NIGVELLAAT PFRHKIGTRA TGDIATVHHL PPTLMAQWYF GDASSKFUPY 120 
VGAGINYTTF FDNGFNDHGK EAGLSDLSLK DSWGAAGQVG VDYIiINRDWIi VNMSVWYMDI 180 
DTTANYKLGG AQQHDSVRLD PWVFMFSAGY RF 212 
<212> Type : PRT 
<211> Length : 212 

SequenceName : SEQ ID 198 

SequenceDescription : 

Sequence 



<213> OrganismName : Shigella flexneri 2a str. 2457T 
<400> PreSequenceString :

MFFKRGKILS AGRLMKKSLG IVMFLSVGLL LAGCSGSKSS DTGTYSGSVY TVKRGDTLYR 60 
ISRTTGTSVK ELARLNGISP PYTIEVGQKL KLGGAKSSSS TRKSTAKSTT KTASVTPSSA 12 0 

VPKSSWPPVG QRCWLWPTTG KVIMPYSTAD GGNKGIDISA PRGTPIYAAG AGKWYVGNQ 180 
LRGYGNLIMI KHSEDYITAY AHNDTMLVNIT GQSVKAGQKI ATMGSTDAAS VRLHFQIRYR 240 
ATAIDPLRYLj PPQGSKPKC 259 
<212> Type : PRT 
<211> Length : 259 

SequenceName : SEQ ID 199 

SequenceDescription : 

Sequence 



<213> OrganismName : Shigella flexneri 2a str. 2457T 
<400> PreSequenceString : 

MAQVINTNSL SLITQNNINK NQSALSSSIE RLSSGLRINS AKDDAAGQAI ANRFTSNIKG 60 

LTQAARNAND GISVAQTTEG ALSEINWNLQ RIRELTVQAS TGTNSDSDLD SIQDEIKSRL 12 0 

DEIDRVSGQT QFNGVNVLAK DGSMKIQVGA NDGQTITIDL KKIDSDTLGL NGFNVNGGGA 18 0 

VANTAASKAD LVAANATWG NKYTVSAGYD AAKASDLLAG VSDGDTVQAT INNGFGTAAS 24 0 

ATNYKYDSAS KSYSFDTTTA SAADVQKYLT PGVGDTAKGT ITIDGSAQDV QISSDGKITA 3 00 

SNGDKLYIDT TGRLTKNGSG ASLTEASLST LAANNTKATT IDIGGTSISF TGNSTTPDTI 3 60 

TYSVTGAKVD QAAFDKAVST SGNNVDFTTA GYSVNGTTGA VTKGVDSVYV DNNEALTTSD 42 0 

TVDFYLQDDG SVTNGSGKAV YKDADGKLTT DAETKAATTA DPLKADDEAI SSIDKFRSSL 48 0 

GAVQNRLDSA VTNLNNTTTN LSEAQSRIQD ADYATEVSNM SKAQIIQQAG NSVLAKANQV 540 

PQQVLSLLQG 550 
<212> Type : PRT 
<211> Length : 550 

SequenceName : SEQ ID 200 

SequenceDescription : 

Sequence 

<213> OrganismName : Shigella flexneri 2a str. 2457T 
<400> PreSequenceString : 

MKKIACLSAL AAVLAFTAGT SVAATSTVTG GYAQSDAQGQ MNKMGGFNLK YRYEEDNSPL 60 

GVIGSFTYTE KSRTASSGDY NKNQYYGITA GPAYRINDWA SIYGWGVGY GKFQTTEYPT 120 

YKHDTSDY6F SYGAGLQFNP MENVALDFSY BQSRIRSVDV 6TWIAGVGYR F 171 

<212> Type : PRT 
<211> Length : 171 

SequenceName : SEQ ID 201 

SequenceDescription : 
Sequence



<213> OrganismName : Shigella flexneri 2a str. 2457T
<400> PreSequenceString :
MKRNIIGGAF TLASLMLAGH ALAEDGWNF VGEIVDTTCE VTSDTADQIV PLGKVSKNAF 60
SGVGSIiASPQ KFSIKLENCP ATYTQAAVRF DGTEAPGGDG DLKVGTPIiTA GNPGDFTGTG 120 
QAIAATGVGI RIFNQSDNSQ VKLYNDSAYT AIDAEGKABM RFXARYVATN ATVTAGTANA 18 0 

DSQFTVBYKK 190 
<212> Type : PRT 
<211> Length : 190

SequenceName : SEQ ID 202

SequenceDescription :



Sequence


<213> OrganismName : Shigella flexneri 2a str. 2457T
<400> PreSequenceString : 

MKKSTLALW MGIVASASVQ AAEIYNKDGM KLDVYGKVKA MHYMSDNASK DGDQSYIRFG 6 0 

PKGETQINDQ LTGYGRWEAE FAGNKAESDT AQQKTRLAFA GliKYKDLGSF DYGRNLGALY 12 0 

DVEAWTDMFP EFGGDSSAQT DNFMTKRASG LATYRNTDFF GVIDGLNLTL QYQGKNENRD 180
VKKQNGDGFG TSLTYDFGGS DFAISGAYTN SDRTNEQNLQ SRGTGKRAEA WATGLKYDAN 240 
NIYIjATFYSE TRKMTPITGG FANKTQNFEA VAQYQFDFGIi RPSLGYVLSK GKDIEGIGDE 3 00 

DLVNYIDVGA TYYFNKNMSA FVDYKINQLD SDNKLNINND DTVAVGMTYQ F 351 



<212> Type : PRT

<211> Length : 351

SequenceName : SEQ ID 203
SequenceDescription :

Sequence



<213> OrganismName : Shigella flexneri 2a str. 2457T 
<400> PreSequenceString : 

MRKQWLGICI AAGMLAACTS DDGQQQTVSV PQPAVCNGPI VEISGADPRF EPLNATANQD 60 
YQRDGKSYKX VQDPSRFSQA GLAAIYDAEP GSNLTASGEA FDPTKLTAAH PTLPIPSYAR 120

ITNLANGRMI WRINDRGPY GNDRVISLSR AAADRLNTSW NTKVRIDPII VAQDGSLSGP 180 
GMACTTVAKQ TYALPAPPDL SGGAGTSSVS GPQGDILPVS NSTLKSEDPT GAPVTSSGFL 24 0 

GAPTTLAPGV LEGSEPTPAP QPWTASSTT PATSPAMVTP QAASQSASGN FMVQVQAVSD 3 00 

QARAQQYQQQ LGQKFGVP6R VTQNGAVWRI QLGPFASKAE ASTLQQRLQT EAQLQSFITT 3 60 

AQ 362

<212> Type : PRT 
<211> Length : 362 

SequenceName : SEQ ID 204 
SequenceDescription :



Sequence 



<213> OrganismName : Shigella flexneri 2a str. 2457T 
<400> PreSequenceString : 
MKKKTIYQCV ILFFSLLNIH VGMAGPEQVS MHIYGNWDQ GCDVATKSAL QNIHIGDFNI 60
SDFQAANTVS TAADLNIDIT GCAAGITGAD VLFSGEADTL AE>TLLKLTDT GGSGGMATGI 120 
AVQILDAQSQ QEIPLNQVQP LTPLKAGDNT LKYQLRYKST KA.GATGGNAT AVLYFDLVYQ 18 0 

<212> Type : PRT 
<211> Length : 180

SequenceName : SEQ ID 205
SequenceDescription :



Sequence


<213> OrganismName : Shigella flexneri 2a str. 2457T
<400> PreSequenceString : 

MKNKLLFMML TILGAPGIAA AAGYDLANSE YNFAVNELSK SSFNQAAIIG QAGTNNSAQL 60 
RQGGSKLLAV VAQEGSSNRA KIDQTGDYNL AYIDQAGSAN DASISQGAYG NTAMIIQKGS 120 
GNKANITQYG TQKTAVWQR QSQMAIRVTQ R 151
<212> Type : PRT 
<211> Length : 151 
SequenceName : SEQ ID 206
SequenceDescription :

Sequence

<213> OrganismName : Shigella flexneri 2a str. 2457T
<400> PreSequenceString : 

MMKFKKCLLP VAMTiASPTLA GCQSNADDHA ADVYQTDQLN TKQETKTVNI ISILPAKVAV 60 
DNSQNKRNAQ AFGALIGAVA GGVIGHNVGS GSNSGTTAGA VGGGAVGAAA GSMWDKTLV 120 
EGVSLTYKEG TKVYTSTQEG KECQFTTGLA WITTTYNET RXQPNTKCPE KS 172 

<212> Type : PRT 
<211> Length : 172 

SequenceName : SEQ ID 207

SequenceDescription :

Sequence



<213> OrganismName : Shigella flexneri 2a str. 2457T
<400> PreSequenceString : 

MQTKKNEIWV GIFLLAALIiA ALFVCLKAAN VTSIRTESTY TLYATPDNIG GLKARSPVSI 60 
GGWVGRVAD ITLDPKTYIiP RVTLEIEQRY NHIPDTSSLS IRTSGLLGEQ YLALliTVGFED 120 
PELGTAILKD GDTIQDTKSA MVLEDLIGQF LY6SKGDDNK KTSGDAPAl^AP GNKTETTEPVG 180 
TTK 183 
<212> Type : PRT 
<211> Length : 183 

SequenceName : SEQ ID 208

SequenceDescription : 

Sequence 



<213> OrganismName : Shigella flexneri 2a str. 2457T
<400> PreSequenceString :

MAPLliFSAQS LAESLTVEQR LELLEKALRE TQSELKKYKD EEKKKYTPAT VNRSVSTIIDQ 60 

GYAANPFPrS SAAKPDAVLV KNEEKNASET GSIYSSMTLK DFSKFVKDEI GFSYNGYYRS 120 

GWGTASHGSP KSWAIGSLGR FGNEYSGWFD LQLKQRVYNE WGKRVDAWM IDGNVGQQYS 180 

TGWFGDKTAGG ENFMQFSDMY VTTKGFLPFA PEADFWVGKH GAPKIEIQML DWKTQRTDAA 240 

AGVGLEWWKV GPGKIDXALV REDVDDYDRS LQNKQQrNTTH TIDLRYKDIP LWDKATLMVS 3 00 

GRYVTANESA SEKDNQDNNG YYDWKDTWMF GTSLTQKFDK GGFNEFSFLV ANNSIARNFG 3 60 

RYAGASPFTT FNGRYYGDHT GGTAVRLTSQ GEAYIGDHFI VANAIVYSFG NNIYSYETGA 420 

HSDFESIRAV VRPAYIWDQY NQTGVELGYF TQQNKDANSlsr KFNESGYKTT LFHTFKVNTS 480 

MLTSRLEIRF YATYIKALEN ELDGFTFEDN KDAQFAVGAQ AJEXWW 525 
<212> Type : PRT 
<211> Length : 525 

SequenceName : SEQ ID 209 

SequenceDescription : 

Sequence 



<213> OrganismName : Streptococcus mutans UA159 
<400> PreSequenceString : 

MKKRILSAVL VSGVTLSSAT TLSAVKADDF DAQIASQDSK X15NLTAQQQA AQAQVNTIQG 60 
QVSALQTQQA ELQAENQRLE AQSATLGQQI QTLSSKIVAR MESLKQQARS AQKSMAATSY 120 
INAIINSKSV SDAINRVSAI REWSANEKM LQQQEQDKAA VEQKQQENQA AINTVAANQE 180 
TIAQNTNALN TQQAQLEAAQ LNLQAELTTA QDQKATLVAQ KAAAEEAARQ AAAAQAAAEA 240
KAAT^EAKALQ EQAAQAQAAA NNNTQATDVS DQQAAAADNT QAAQTGDSTE QSAAQAVNNS 3 00 

DQESTTATEA QPSASSASTA AVAANTSSAN TYPAGQCTWG VKSIiAPWVGN YWGNGGQWAA 3 60 

SAAAAGYRVG STPSAGAVAV WNDGGYGHVA YVTGVQGGQI QVQEANYAGN QSIGNYRGWF 420 
NPGSVSYIYP N 431 
<212> Type : PRT 
<211> Length : 431 

SequenceName : SEQ ID 210 

SequenceDescription : 

Sequence 

<213> OrganismName : Streptococcus mutans UA159 
<400> PreSequenceString :

MKVKKTYGFR KSKISKTLiCG AVLGTVAAVS VAGQKVFADE TTTTSDVDTK WGTQTGKTPA 60 

TNIiPEAQGSA SKEAEQSQNQ A6ETNGSIPV EVPKTDLDQA AKDAKSAGVN WQDADVNKG 12 0 

TVKTAEEAVQ KETEIKEDYT KQAEDIKKTT DQYKSDVAAH EAEVAKIKAK NQATKEQYEK 180 

DMAAHKAEVE RINAANAASK TAYEAKLAQY QADLAAVQKT NAANQAAYQK ALAAYQAELK 240

RVQEANAAAK AAYDTAVAAN NAKNTEIAAA NEBIRKRNAT AKAEYETKLA QYQAELKRVQ 300 

EANAANEADY QAKLTAYQTE IiARVQKANAD AKAAYEAAVA AKNAKNAALT AENTAIKQRIT 360 

ENAKATYEAA LKQYEADLAA VKKANAANEA DYQAKLTAYQ TELARVQKAN ADAKAAYEAA 420 

VAANNAANAA LTAENTAIKK RHADAKADYE AKLAKYQADL AKYQKDIiADY PVKLKAYEDE 480 

QASIKAALAE LEKHKNEDGKT LTEPSAQNLV YDLEPNANLS LTTDGKFLKA SAVDDAFSKS 540

TSKAKYDQKI LQLDDLDITN LEQSNDVASS MELYGKTFGDK AGWSTTVSNlSr SQVKWGSVXiL 600 

ERGQSATATY TNLQNSYYNG KKISKIVYKY TVDPKSKFQG QKVWLGIFTD PTLGVFASAY 660 

TGQVEKNTSI FIKKTEFTFYD EDGKPINFDN ALLSVASLNR ENNSIEMAKD YTGKFVKISG 720 

SSIGEKNGMI YATDTLMFRQ GQGGARWTMV TRASEPGSGW DSSDAPNSWY GAGAIRMSGP 780 

NNSVTLGAIS STLWPADPT MAIETGKKPN IWYSLNGKIR AVNVPKVTKE KPTPPVKPTA 840

PTKPTYETEK PLKPAPVAPM YEKEPTPPTR TPDQAEPNKP TPPTYETEKP LEPAPVEPSY 900 

EAEPTPPTRT PDQAEPNKPT PPTYETEKPL EPAPVEPSYE AEPTPPTPTP DQPEPNKPVE 960 

PTYEVIPTPP TDPVYQDLPT PPSVPTVHFH YFKIiAVQPQV NKEIRNNNDI NIDRTLVAKQ 1020 

SWKFQLKTA DLPAGRDETT SFVLVDPLPS GYQFNPEATK AASPGFDVTY DISTATNTVTFK 10 8 0 

ATAATLATFN ADLTKSVATI YPTWGQVLN DGATYKNNFT LTVNDAYGIK SNWRVTTPG 1140

KPNDPDNPNN NYIKPTKVNK NEMGWIDGK TVLAGSTNYY ELTWDLDQYK NDRSSADTIQ 12 0 0 

KGFYYVDDYP EEALELRQDL VKITDANGNE VTGVSVDNYT NLEAAPQEIR DVLSKAGIRP 1260 

KGAFQIFRAD NPREFYDTYV KTGIDLKIVS PMWKKQMGQ TGGSYENQAY QIDFGNGYAS 13 2 0 

NIIINNVPKI NPKKDVTLTL DPADTNNVDG QTIPLNTVFN YRLIGGIIPA DHSEELFEYM 13 8 0 

FYDDYDQTGD HYTGQYKVFA KVDITFKDGS IIKSGAELTQ YTTAEVDTAK GAITIKFKEA 1440

FLRSVSXDSA FQAESYIQMK RIAVGTFENT YINTVNGVTY SSNTVKTTTP EDPTDPTDPQ 15 0 0 

DPSSPRTSTV INYKPQSTAY QPSSVQETLP NTGVTNNAYM PLLGII6LVT SFSLLGIiKAK 1560 

KD 1562 
<212> Type : PRT 

<211> Length : 1562

SequenceName : SEQ ID 211
SequenceDescription :



Sequence


<213> OrganismName : Streptococcus mutans UA159
<400> PreSequenceString :

MIiTEIjKAVLK KPMLWITMVG VAIiVPALYNI XFLSSMWDPY GKVSDLPVAV VNKDKTATYE 60 

GKKMTIGKDM TDNMVRNKSL DYHFVDSEKA QKGLEKGDYY MIITLPEDLS QNAASVIiTDE 12 0 

PKKLTIPYQT SKGHSFVASK MSETAAKTLK ESVSKNITSS YTKSLFKNMS TLKTGLGSAA 180

NASQKIATGS KQLANGSQVM TDNLNLLSNS SQSFAQGTNT LYSGLTAYTG GVGQLSAGLN 24 0 

NLNNGLTAYT NGVGQLANGS SQLSNQSQKL LGGVAQLAJSTG SASIQQLVNA SSQLNQGLIK 3 00 

LSTATGLSEE QVQQFSSLIN QLGTLNQSIQ NYSDNGTATT ANSPDLSTYL SAITTAAQAI 3 60 

VNSGNTSQQT TTNQSNALAA VQATGAYQRL SAEDQSEIAA ALANTGSSTT TTGADANAVS 42 0 

QAQAILNNVQ SIQSALSTLQ TTTANTPTSP SASLTQIKNT ANSVLPSAAT SLTTLSSGLT 480

QAKTALDSQV VPVSTALANG TAQLGSTFST GANSLMTGVG QYTNAVDILN AGANTIiAAKN 540 

NQLTDGTSQL VNGANQLNSN SGQLTKGTAQ LANGANQIET GAGKLAAGGE SLTAGLTTLS 600 

SGSGELSKAL STAKNKLSLV AVDNDNAKTL SSPVTIKHTD KDNVKTNGVG MAPYMMSAAL 660 

MVMAISTNTI FRVALSGKQA KTLREWIDQK LAVNGLIAVT GAIILYFGVH IIGLSANFEL 72 0 

KTLGIillLTS ITFMVLVTTL VTWHDKFGSF AALILLLLQL GSSAGTYPIA VTDKFFQWN 780

PYLPMSYSVS GLRETISMAG TIGNQLLALS LFFLTFAALG LLIARRRIRS VKVA 834 

<212> Type : PRT 
<211> Length : 834 
SequenceName : SEQ ID 212

SequenceDescription :

Sequence



<213> OrganismName : Streptococcus mutans UA159
<400> PreSequenceString :

MVSQKNKSKK GQSKTFTLIS NRINLLFFLI VALFTVLLLR LAQMQLYDAK FYKSKLTEST 60 

TYTIKTSSPR GQIYDAKGVA LVENEVKEW AFTRSKTTMTA KDIKANAKKL ADMVTLTESK 120 

VTKRQKKDYY LADPKNYQKI VKKLPNNKKY DNFGNNLTES KIYANAVKAV PNSAIDYSED 180 

EKKIIHIFSQ MNATSVFNTA SLTTGDLTAE QIAVLATSKS DLKGISVKTD WERKTDKNSI 240

TSIIGKVSSQ KTGLPAEEAN NYVKKGYSLN DRVGTSYLEK QYEl^LQGSR TVQAIKVNKE 3 00 

6KIISDKTTA KGTKGKNLKL TLDLEPQKGV EQILNQYFNS ELASGNTKYS EGVYAWLKTP 3 60 
NTGA.VLSMAG LEHDLICTGEV SSNALGAVTE VFTPGSWKG ATLTAGWENG VXiSGNQVLND 420
QPIQFAGSSP INSWFTNGST PLTASQSIiEY SSNTYMVQLA LKLMGQDYHS GMTLSTDGYK 480 
EAMEKLRATY AQYGLGVSTG IDLPGESKGY TPEHYDPSNV LTESFGQFDN YTAMQIAQYA 540 
AAVANGGKRI APHLVEGIYD NNKTGGLGNL VQSIDTKVLN NVSISSDDMG riKEGPYNW 600 
NGGSYATGKT LAKGASVPIS AKTGTAEAYV TGDDGKSVYT SNLNWAYAP SSNPQIAVAV 660 
VLPHETDLHG TTSHAITRDI INLYQKMYPM NQ 692 
<212> Type : PRT 
<211> Length : 692 

SequenceName : SEQ ID 213

SequenceDescription :



Sequence



<213> OrganismName : Streptococcus mutans UA159
<400> PreSequenceString : 

MTVLKYGLGI LLSAIILAII IGGLLPTYYV SSTPKLSEAK LKATNSSLVY DSNNNLIADL 60 

GABKRESISS DSrPMKLVNA VTSIBDHRFF KHRGVDIYRI IGAAWSNLLH ECSTQGGSTLD 120 

QQLIKLAYFS TKESDQCTLKR KAQEVWLSIiQ MEKKYTKEEI LTFYVNKVYM SNGNYGMRTA 180 

AKSYYGKDLK DLSIAQLATL AGIPQAPTQY DPYAQPKAAT SRRNTVLSQM YKHKKITKRE 240 

YDAAVATPIS DGLQELKRSS SYPKYMDNYL KQVISEVKKR TGQDIFSAGM KVYTNWADA 3 00 

QQYLWNIYNT DEYIAYPDDN FQVASTVMDV TNGKVIAQLG GRHQDTKTVSF GTNQAVLTDR 3 60 

DWGSTMKPIS AYGPALESEA FTTTAQMLND SVYYYPGTTT QVYDWDHRYN GWMTIQTAIQ 420 

QSRNVPAVRA IDAAGLDTAK GFLSGLGIDY PEMRYSNAIS SNTSSSEQKY SASSEKMAAA 480 

YAAFSNGGTY YEPQYVNKIE FKDGTSETYD AKGNRAMKET TAYMMTDMLK TVLTYGTGTE 540 

AAIPGLYQAG KTGTSNYDDN ELVEMSEKLG INPYGL6TIA PDENFVGYTP QYSMAVWTGY 600 

KNRLMPVYGD SMKIAAQVYR TMMAYIiSSSG NSDWTMPDGL YRSGGYLYLKT GSSGSNSRYG 660 

AAPATSSSSS SSSSSDSNNN DQNNNQTTEA SSDSSSSSSD ATTSSNP 707 
<212> Type : PRT 
<211> Length : 707 

SequenceName : SEQ ID 214

SequenceDescription : 



Sequence 



<213> OrganismName : Streptococcus mutans UA159
<400> PreSequenceString :

MKSKiaKITL LSSLALAAFG ATNVFADEAS TQLNSDTVAA PTADTQASEP AATEKEQSPV 60 
VAWESHTQG NTTTTTSQVT SKELEDAKAKT ANQEGLEVTE TEAQKQPSVE AADADNKAQA 120 
QTIN'EAVADY QKAKAEFPQK QEQYNKDFEK YQSDVKEYEA QKAAYEQYKK EVAQGLASGR 180 
VEKAQGLVFI NEPEAKLSIB GVNQYLTKEA RQKHATBDIL QQYNTDNYTA SDFTQANPYD 240 
PKEDTWFKMK VGDQISVTYD NIVNSKYKDK KISKVKINYT LNSSTNNEGS MiVNLFHDPT 3 00 

KTIFIGAQTS NAGKNDKISV TMQIIFYDEN GNEIDLSGNN AIMSLSSLNH WTTKYGDHVE 360 
KVNLGDNEFV KIPGSSVDLH GNEIYSAKDN QYKANGATFN GDGADGWDAV KTADGTPRAAT 420 
AYYGAGAMTY KGEPFTFTVG GNDQNLPTTI WFATNSAVAV PKDPGAKPTP PEKPELKKPT 480 
VTWHKNLWE TKTEEVPPVT PPTTPDEPTP EKPKTPEDPQ SPWAKSVSF RTARKGEMRV 540 
RERDYQPTLP HAGAAKQNGL ATLGAISTAF AAATLIAARK KEN 583 
<212> Type : PRT 
<211> Length : 583 

SequenceName : SEQ ID 215 

SequenceDescription : 

Sequence 



<213> OrganismName : Streptococcus mutans UA159 
<400> PreSequenceString : 

MBQKIPSKRK SKIAGLCGAI LTTTWALAS GTVIEADETI EQPVAAETVS QADGDNPEQT 60 
TSVQQETAPQ QTKTSQSSDA TVDSEESATS PSDEQTVSQN DSNSSSQIDQ TIADTNRSDS 120 
DHISKTSAAT TEDQEEKVNS AKAQTAAATN NQDTRYSAKD AYGNSNFNKT LTEF6KNANV 180 
ADVTYNGVRD EYIWNDPSA PYVPNANEIA KYLKEYLTEL RNINNIAIPV IPSVDQVMQKY 240 
AQDRANEEAN EKNGLDHDTN LPIPNNLTWV AEDGHLDMDS SIQSKSQEGY TLASDKATAY 300 
YLALNWFSDY FNIYDDPNDG LKSFGHAVSI LSDGGTGMGL GLASGQDNEK C3MWYAQLEFG 360 
GNDNBDNTND FSSLKNGKGE WVLYYKGSPV KFLPNTTFWY VKKGTSPDAA STPHNSDKPS 420 
FQSSKDLDPN FKADNRFQEG KEASVHQAIP ATFKSHRDEV GNKDQNSLSA QLPDTGVQKN 480 
NQLALIALGT GLILLSGLLL SKRKSLK 507 
<212> Type : PRT 
<211> Length : 507 

SequenceName : SEQ ID 216 
SequenceDescription :
Sequence

<213> OrganismName : Streptococcus mutans UA159

<400> PreSequenceString :

MTFEKQKHFS LRKLKFGLVS VAIIAFLFAV TKTAEADETV TTEQRQTSKI NASSQKVENQ 60
TSNQVEAKTD SANKDPQEKT GSVATDAPSM NSANNMSQSD KQNTVNEISS DSQQTKTDEQ 120
TDLPQNSFKQ QSAHVKMTTE AEKTPSHSIN TFVNDGNGNW YYLGADGRNV TGSHTIGGKT 180
MYFAQDGKQV KGAFAQDSDG NKHYYDRDSG EMWTNRFVND QGNWYYLNND GVPVTGSITV 240
NGQSLYFNSD GSQVKGNFVE EDGSLRYYDK NSGDLLRKTS RTINGVNYQF DNDGNARAID 300
KIEWKTSLV VDSYEFGPSV SKIILEFNHK VTPAWHAGA IVTVTTAGVQRK ILNSYVSNAS 360
GHWYFDSSH YVTLELDIPY DPNDSSRNAS PFIFDSAAFR NNWVNSYTVK VDNLQVQADG 420
SNSSQIISSE QDAINNRFLP TTOEFSERaS YGNFNYAAYQ PEAAIGGEKN .PLIWLHGIG 480
EVGTDINIPIi LASNVARLTE DPIQSHFTST GSGGQKGAYV LVPQSSIPWS QNQTASLMAL 540
IKAYVASHPD IDSRRIYIoAG VSNGGGMTLD MGVAY-PNYFA ALVPIAASYS NQLTDNQITA 600
AAIiKALKGQP MWLIHTRTDK TISADSSVLP FYKELLQAGA QNKWLSYYET NVGKHHSGVT 660
YNGHWSWIYF LNDQVTGTQN TDNAKNWSGL SGMVATNPTY GGDAKATVNG RTYSNVFDWL 720
NGQRRR 726
<212> Type : PRT

<211> Length : 726

SequenceName : SEQ ID 217
SequenceDescription :

Sequence



<213> OrganismName : Streptococcus mutans UA159

<400> PreSequenceString :

MKIFIKKHQQ SILYYSLSFL LPSFIMFLVL FSKNIYWGSS TTILASDGFH QYVIFDALFR 60
NILHGTDSLF YSFKAGLGFN IFALTSYYLG SFLTPFTYFF NVKNMADAFY LFTLIKFGLI 120
GLSAFYSLGQ lYTKISKSLV LMLSTSYALM SFTSSQLELN NWLDVFILLP LIMLGLQRLV 180
EKRGIFLYFL TIiTCLFIQNY YFGFMTAIFL TLWFFTQVSW DIRNRMKRLS DFVLVSIFAT 240
nTSAFMTiLPT FLDLKSHGEV BTEQISLFSS DIWYFDFPAK SLLGSYDTTK YGSIPTIYIG 300
LIiPIjXFAITF FFVKSIKWQV KVAYFLLLAI IIASFIFQPL DLFWQGMHSP NMFLHRYSWA 360
FSLVIVIMAA ETLTRIKDIK LKNFYPAFTF LGVGLLATFL FKDYYNYLTQ VNFIIiTTIFL 420
VSYFIIIiFTF FNQLVSYKVI ISFTLIFTSF EIALNTFYQI EGIQTDWNFP SREVYEDNVK 480
EIDNYVKKTK KDNLEFFRTE KQIPQTYNDG MKFNYNSISQ FSSVKNNLSA QLLNSLGYYS 540
QGNHSTISYP NNTILMDSIiF SIKYNINNQKT PHKFGFHLKQ KNlSrKLQLYKN FYSLPLALMS 600
NHIYKDVKFD SYPLDNQQKF WELTDLNLT LFKEIPIISS VGMQVLDNRV TINGSKGNKA 660
QVYYTVKCPA NSQLYISLPN LTVNNKDENV FITTNKHTSS YIIDESYYLF NLGNYiCKTQT 720
LIFKLSFPKiSr KTVSYDLPHI YALDLTAYQK SIKQLKSQTV KTTTKKNKIF TTYVAKKRTS 780
LIYTLPYDKG WFAKQNGKAI KISKAQNGLM KIDVSKGSGK IIMTFVPQGL YQGILLTCLG 840
IFLFVFYQLY YKKFNIiK 857

<212> Type : PRT
<211> Length : 857

SequenceName : SEQ ID 218
SequenceDescription :

Sequence



<213> OrganismName : Streptococcus mutans UA159
<400> PreSequenceString :

MKLKHILRIG AVAFASILLL TACGSKTSKK TVTLATVGTT NPFSYEKKGK LTGYDIEVAK 60
EVFKASDKYD VKYQKTEWTS IFSGLDSDKY QIGANNISYT KERANKYLYS NPTASNPLVL 120

WPKDSDIKS YNDIAGHSTQ WQGNTTVSM LQKFNKNHEN ISTQVKLNFTSE DLAHQIRNVS 180
DGKYDFKIFE KISAETIIKE QGLDNLKVID LPSDQKPYVY FIFAQDQKDL QKFVNKRLKK 240
LYENGTLEKL SKKYLGGSYL PDKKDMK 267
<212> Type : PRT
<211> Length : 267

SequenceName : SEQ ID 219

SequenceDescription :

Sequence



<213> OrganismName : Streptococcus mutans UA159
<400> PreSequenceString :

MRFLVFLIAP FAAFYKPIET ERIDSNTVAV NPDSLILKRF LKTNQLNGIM IVTGPDGKAQ 60

VFSNQSKVDG SPVSIKDYFP LASLQKLITG VAIQQLIDKG KLSLNTPLSK YYPQIENSEN 120

ITIQNLLTHT SGIiADRKEVP QQVLTTQEQQ LDFSLTNYRV TYRKKWKYAN INYALIaAC3-I I 180 

SQISGQNYAT YVRQHFLTAG KGWHFKKYIQ IKDKSKLAAL SVMDQSTTWD KLSKEVTSTF 240 

GAGDYASRPV DYWKFMMAFI NDQFVPVSEY QRSMKMTSKS YYGGLYISQK MLHANGGGFD 3 00 

TYSCFAYSNP KTKQVMVIiPI TNGKYKRVKS LAAKAFKLYA DSYALRKNET SK 352



<212> Type : PRT 
<211> Length : 352 

SequenceName : SEQ ID 220 
SequenceDescription :

Sequence



<213> OrganismName : Streptococcus mutans UA159

<400> PreSequenceString :

MKKKIALAAL SFVSAAVLAA CSSAPGGSSD AAGNKIGDTV KIGYNLELSG DVAAYGQAEK 60
NGANLAVEEI NKAGGIDGKK IKVISKDNKS DNGEASTIST NLATQSKVNA ILGPATSGAT 120
AAAAPNANDA AVPLVTPSGT QDNLTYSKGK VQDYIFRTTF QDSFQGKIIA KYATDNLKAK 180
KVALYYDKSS DYAQGIADAF KKAYKGKITV EDTFQAKDQD FQAALTKFKM KDFDAIVXPG 240
YYTETGLITK QARDMGLTQP ILGPDGFNDE KYVEGAGAAN TNNVHYVSGY STKVALTJSTKA 300
EKFLKDYKAK YGEEPNMFAA LAYDSVYMIA DAAKDAKTSK DIATNLAKLK NFKGVTGKMT 360
IDKKHNPVKS AVMVGLKDGK EDTATAVEAK 390
<212> Type : PRT
<211> Length : 390

SequenceName : SEQ ID 221

SequenceDescription :

Sequence



<213> OrganismName : Streptococcus mutans UA159

<400> PreSequenceString :

MKKLSLLLLV CLSLLGLFAC TSKKTADKKL TWATNSIIA DITKNIAGNK WLHSIV^PVG 60

RDPHEYEPLP EDVKKTSQAD VIFYNGINLE NGGNAWFTKL VKNAHKKTDK' DYFAVSDSVK 120

TIYLENAKEK GKEDPHAWLD LKNGIIYAKN IMKRLSEKDP KNKSYYQKNF QAYSAKIiEKIi 180

HKVAKEKISR XPTEKKMIVT SEGCFKYFSK AYDIPSAYIW EINTEEEGTP NQIKALVTCKL 240

RKSRVSALFV ESSVDDRPMK TVSKDTGIPI AAKXFTDSVA KKGQAGDSYY AMMKWWIDKI 300

ANGIiSQ 306

<212> Type : PRT

<211> Length : 306
SequenceName : SEQ ID 222

SequenceDescription :

Sequence



<213> OrganismName : Streptococcus mutans UA159
<400> PreSequenceString :
MFVHTKTKKK RKWQRKVFLL LLLFLLPIVS VLAFIVLFIG GGTAESHDVE ATTGGVKLSA 60
KQFADKTKLG ISEEEAKNAL AFADRLMSRH HFTAQATAGV LAVGFRESGF DVKAVNUSGG 120
VAGFFQWSGW GSSVNGDRWK VASKRELTLE VEVDLMSTEL DGRYADWKK VGSATDEKQA 180

AKDWSQYYEG VAVSDGQTKA DKIESWATTI CEALKSGGTN YAKVNNTGTS STAIPQGWEN 240
ISAFDGHAYE GSENYPQGQC TWYVYNRAKQ LGVSFSPYMG NGGQWYQVQG YHSSHTPKAH 300

TALSFVNGQA GSDPTYGHVA FVEAVKDDGS ILISEMNVYG QPAMTVAYRT FDAETAKQFW 360
YVEGK 365
<212> Type : PRT

<211> Length : 365

SequenceName : SEQ ID 223
SequenceDescription :

Sequence



<213> OrganismName : Streptococcus mutans UA159
<400> PreSequenceString :

MKMKRKLLSL VSVLTILLGA FWVTKIVKAD QVTNYTNTAS ITKSDGTALS NDPSKAVISTYW 60
EPLSFSNSIT FPDEVSIKAG DTLTIKLPEQ LQFTTALTFD VMHTNGQLAG KATTDPITTGE 120
VTVTFTDIFE KLPNDKAMTL NFNAQLNHNN ISIPGWZSTFN YNNVAYSSYV KDKDITPISP 180
DVNKVGYQDK SNPGLIHWKV LINNKQGAID NLTLTDWGE DQEIVKDSLV AARLQYXAGD 240
DVDSLDEAAS RPYAEDFSKN VTYQTNDLGL TTGFTYTIPG SSNNAIFISY TTRLTSSQSA 300
GKDVSNTIAI SGNNINYSNQ TGYARIESAY GRASSRVKRQ AETTTVTETT TSSSSETTTS 360

EATTETSSTT N13NSTTTETA TSTTGASTTQ TKTTASQTNV PTTTNITTTS KQVTKQKAKF 420 
VLPSTGEQAG LLLTTVGLVI VAVAGVYFYR TRR 453 
<212> Type : PRT 
<211> Length : 453 

SequenceName : SEQ ID 224 

SequenceDescription : 

Sequence 



<213> OrganismName : Streptococcus mutans UA159
<400> PreSequenceString : 

MTFKKLVLGL LSFVAVFTLV ACSSSNSKNIj QDDIKEKKKL WAVSPDYAP FEFKALiVNGK 60 
DTWGADIDIi AKAIAKELGV KLEIiSSMSFD NVLSSLKTGK ADIAISGLSY TKERAQAYDF 120

SEAYYKTBNA ILIKKSDLNK YTMISSFJaNK TKVAVQKGTI EEGLAKNQLK QSNITSLTSM 18 0 

GEAVNELKSG QVDAIDLEKP VAEGYVSQNS DLVLAKVALK TGEGDAKAVA LPKDSGQIiVK 240 
TVNKVIKKLK KEDKYKQFIS DAVKLTGQQV D 271 
<212> Type : PRT 
<211> Length : 271 

SequenceName : SEQ ID 225 

SequenceDescription : 

Sequence 



<213> OrganismName : Streptococcus mutans UA159
<400> PreSequenceString : 

MKKHFFMTFS LLLAAVFLVA CSNLSDSGQR NWDKINKRGM LKIATAGTLY PQSYHDDHNK 60 
LTGYDVEILK EIGKRLGLKV QFTEMGVDGM LTAIKSGQID VANYSLEDGN miSKFLRTS 120

PYKYSFTSMV VRSKDDS6IH SWSDLKGKKA AGAASTNYMK lAKKLGAKLV VYDlsIVTNDVY 180

MKDLVNGRTD VIINDYYLQK lAVAAVKDKY AIKINQGLYA NPYSTSFTLS LKNKVLQKKI 240 
NKAVKDMRKD GTLTKLSKKF FQGEDVTKKH YNSYKKIDIS DVD 283 
<212> Type : PRT
<211> Length : 283

SequenceName : SEQ ID 226

SequenceDescription :

Sequence 






<213> OrganismName : Streptococcus pneumoniae R6
<400> PreSequenceString :

MKLLKKMMQV ALATFFFGLL GTSTVFADDS EGWQFVQENG RTYYKKGALK ETYWRVIDGK 60
YYYFDPLSGE MWGWQYIPA PHKGVTIGPS PRIEIALRPD WFYFGQDGVL QEFVGKQVLE 120

AKTATNTNKH HGEEYDSQAE KRVYYFEDQR SYHTLKTGWI YEEGYWYYLQ KDGGFDSRIN 180 
RLTVGBLARG WVKDYPLTYD BEKLKAAPWY YLDPATGWQN LGNKWYYLRS SGAIVIATGWYQ 240 
EGSTWYYLNA SNGDMKTGWF QVN6NWYYAY DSGALAVNTT VGGYYLNYNG EWVK 294 

<212> Type : PRT 
<211> Length : 294 

SequenceName : SEQ ID 227 

SequenceDescription : 

Sequence 

<213> OrganismName : Streptococcus pneumoniae R6 
<400> PreSequenceString : 

MKLLKKMMQV LLAVFFFGLL ATNTVFANTT GGRFVDKDNR KYYVKDDHKA lYWHKIDGKT 60 
YYFGDIGBMV VGWQYLBIPG TGYRDNLFDN QPVNEIGLQE KWYYFGQDGA LLEQTDKQVL 120

EAKTSENTGK VYGEQYPLSA EKRTYYFDNN YAVKTGWIYE DGNWYYLNKL GNFGDDSYNP 180

LPIGEVAKGW TQDFHVTIDI DRSKPAPWYY LDASGKMLTD WQKVNGKWYY FGSSGSMATG 240 
WKYVRGKWYY LDNKNGDMKT GWQYLGNKWY YLRSSGAMVT GWYQDGLTWY YLliTAGNGDMK 300

TGWFQVNGKW YYAYSSGALA VNTTVDGYSV NYNGEWVQ 338

<212> Type : PRT 
<211> Length : 338 

SequenceName : SEQ ID 228 

SequenceDescription : 



Sequence 
<213> OrganismName : Streptococcus pneumoniae R6
<400> PreSequenceString : 

MNKKKMILTS LASVAILGAG FVASQPTWR AEESPVASQS KAEKDYDAAK KDAKNAKKAV 60 

EDAQKAIiDDA KAAQKKYDED QKKTEEKAAL EKAASEEMDK AVAAVQQAYL AYQQATDKAA 120

KDAADKMIDE AKKREEEAKT KFNTVRAMW PEPEQLAETK KKSEEAKQKA PELTKKLEEA 180 

KAKLEEAEKK ATEAKQKVDA EEVAPQAKIA ELENQVURLE QELKEIDESE SEDYAKEGFR 240 

APLQSKLDAK KAKLSKLEEL SDKIDELDAE lAKLEDQLKA AEENNNVEDY PKEGLBKTIA 300 

AKKAELEKTE ADLKKAVNEP EKPAPAPETP APEAPAEQPK PAPAPQPAPA PKPEKPAEQP 360 

KPEKTDDQQA EEDYARRSEE EYNRIiTQQQP PKAEKPAPAP KTGWKQENGM WYFYNTDGSM 420

ATGWLQNNGS WYYLNSNGAM ATGWLQYNGS WYYLNANGAM ATGWAKVNGS WYYUSIANGAM 480 

ATGWLQYNGS WYYLNANGAM ATGWAKVNGS WYYLNANGAM ATGWLQYNGS WYYLNANGAM 540 

ATGWAKVNGS WYYLNANGAM ATGWVKD6DT WYYLEASGAM KASQWFKVSD KWYYVNGL6A 600 
LAVNTTVDGY KVNANGEWV 619
<212> Type : PRT

<211> Length : 619 

SequenceName : SEQ ID 229
SequenceDescription :

Sequence



<213> OrganismName : Streptococcus pneumoniae R6
<400> PreSequenceString : 

MKILPFIARG TSYYLKMSVK KLVPFLWGL MLAAGDSVYA YSRGNGSIAR GDDYPAYYKN 60 
GSQEIDQWRM YSRQCTSFVA FRLSNVNGPB IPAAYGNANE WGHRARREGY RVDNTPTIGS 120

ITWSTAGTYG HVAWVSNVMG DQIEIEEYNY GYTESYNKRV IKANTMTGFI HFKDLDSGSV 180

GNSQSSASTG GTHYFKTKSA IKTEPLVSAT VIDYYYPGEK VHYDQILEKD GYKWLSYTAY 240

NGSYRYVQLE AVNKNPLGNS VLSSTGGTHY FKIKSAIKTE PLVSATVIDY YYPGEKVHYD 300

QILEKDGYKW LSYTAYNGSR RYIQLEGVTS SQNYQNQSGN ISSYGSNNSS TVGWKKINGS 360

WYHFKSNGSK STGWLKDGSS WYYLKLSGEM QTGWLKENGS WYYLGSSGAM KTGWYQVSGE 420
WYYSYSSGAL AINTTVDGYR VNSDGERV 448 
<212> Type : PRT 
<211> Length : 448 

SequenceName : SEQ ID 230
SequenceDescription :

Sequence 



<213> OrganismName : Streptococcus pneumoniae R6 

<400> PreSequenceString :

MFASKSERKV HYSIRKFSIG VASVAVASLV MGSWHATEN EGSTQAATSS NMAKTEHRKA 60 

AKQWDEYIE KMLREIQLDR RKHTQNVALN IKLSAIKTKY LRELNVLEEK SKDELPSEIK 120

AKLDAAFEKF KKDTLKPGEK VAEAKKKVEE AKKKAEDQKE EDRRNYPTNT YKTLELEXAE 180 

FDVKVKEAEL ELVKEEAKES RNEGTIKQAK EKVESKKAEA TRLENIKTDR KKAEEEAKRK 240 

ADAKLKEANV ATSDQ6KPKG RAKRGVPGEL ATPDICKENDA KSSDSSVGEE TLPSSSLKSG 300

KKVAEAEKKV EEAEKKAKDQ KEEDRRNYPT NTYKTLDLEI AESDVKVKEA ELELVKEEAK 360

EPRDEEKIKQ AKAKVESKKA EATRLENIKT DRKKAEEEAK RKAAEEDKVK EKPAEQPQPA 420

PATQPEKPAP KPEKPABQPK AEKTDDQQAB EDYARRSEEE YNRLTQQQPP KTEKPAQPST 480

PKTGWKQEN6 MWYPYNTDGS MATGWLQNNG SWYYLNANGA MATGWLQNNG SWYYLNANGS 540

MATGWLQNNG SWYYLNANGA MATGWLQYNG SWYYLNSNGA MATGWLQYNG SWYYLNANGD 600

MATGWLQNNG SWYYLNANGD MATGWLQYNG SWYYLNANGD MATGWVKDGD TWYYLEASGA 660 

MKASQWPKVS DKWYYVNGSG ALAVNTTVD6 YGVNANGEWV N 701 
<212> Type : PRT 
<211> Length : 701 

SequenceName : SEQ ID 231

SequenceDescription : 

Sequence 



<213> OrganismName : Streptococcus pneumoniae R6
<400> PreSequenceString : 

MKKTTILSLT TAAVILAAYV PNEPILAAYV PNEPILADTP SSEVIKETKV GSIIQQNNIK 60 

YKVLTVEGNI GTVQVGNGVT PVEFEAGQDG KPFTIPTKIT VGDKVFTVTE VASQAFSYYP 120 

DETGRIVYYP SSITIPSSIK KIQKKGPHGS KAKTIIPDKG SQLEKIEDRA PDFSELEEIE 180 

LPASLEYIGT SAFSFSQKLK KLTFSSSSKL ELISHEAFAN LSNLEKLTLP KSVKTLGSNL 240

FRLTTSLKHV DVEEGNBSFA SVDGVLPSKD KTQLIYYPSQ KNDBSYKTPK ETKELASYSP 300 

NKNSYLKKLE LNEGLEKIGT PAFADAIICLE EISLPNSLET lERLAFYGNL ELKELILPDN 360 
VKNFGKHVMN 6LPKFLTLSG NNINSLPSFF LSGVLDSLKE XHIKNKSTEP SVKKDTFAIP 420

ETVKFYVTSE HIKDVLKSNL STSNDIIVEK VDNIKQETDV AKPKKN-SNQG WGWVKDKGL 480

WYYLNESGSM ATGWVKDKGB WYYLNESGSM ATGWVKDKGL WYYLNESGSM ATGWVKDKGL 540

WYYLNESGSM ATGWVKDKGL WYYLNESGSM ATGWVKDKGL WYYLNESGSM ATGWVKDKGL 600

WYYLNESGSM ATGWVKDKGL WYYUSTESGSM ATGWVKDKGL WYYLNESGSM ATGWVKVSGK 660

WYYTYNSGDL LVNTTTPD6Y RVNANGEWVG 690
<212> Type : PRT 
<211> Length : 690

SequenceName : SEQ ID 232
SequenceDescription :

Sequence






<213> OrganismName : Streptococcus pneumoniae R6

<400> PreSequenceString :

MBINVSKLRT DLPQVGVQPY RQVHAHSTGN PHSTVQNEAD YHWRKDPELG FFSHIVGNGC 60

IMQVGPVDNG AWDVGGGWNA ETYAAVELIE SHSTKEEPMT DYRLYIELLR NLADEAGLPK 120

TLDTGSLAGI KTHEYCTNNQ PNNHSDHVDP YPYLAKWGIS REQFKHDIEN GLTIETGWQK 180

NDTGYWYVHS DGSYPKDKFE KINGTWYYFD SSGYMLADRW RKHTDGNWYW FDNSGEMATG 240

WKKIADKWYY FNEEGAMKTG WVKYKDTWYY LDAKEGAMVS NAFIQSADGT GWYYLKPDGT 300

LADRPEPTVE PDGLITVK 318
<212> Type : PRT 
<211> Length : 318 

SequenceName : SEQ ID 233 

SequenceDescription :

Sequence 



<213> OrganismName : Neisseria meningitidis Z2491
<400> PreSequenceString :

MTFAYWCXLX AYLLPLFCAA YAKECAGGFRF KDNHNPRDFL ARTQGTAARA HAAQQNGFEA 60

FAPFAAAVLT AHATGNAGQA TVNTLAGLFI LFRLAFIWCY lADKAALRSL MWVGGFVCTV 120

GLFWAA 127
<212> Type : PRT
<211> Length : 127

SequenceName : SEQ ID 234
SequenceDescription : 



Sequence 



<213> OrganismName : Neisseria meningitidis Z2491
<400> PreSequenceString : 

MNKIYRIIWN SALNAWVAVS ELTRNHTKRA SATVKTAVLA TLLFATVQAN ATDEDEEEEL 60

ESVQRSWGS IQASMEGSGE LETISLSMTN DSKEFVDPYI WTLKAGDNL KIKQNTNENT 120

NASSFTYSLK KDLTGLINVE TEKLSFGANG KKVNIISDTK GLNFAKETAG TNGDTTVHLN 180

GIGSTLTDTL AGSSASHVDA GNQSTHYTRA ASIKDVLNAG WNIKGVKTGS TTGQSENVDF 240

VRTYDTVEFL SADTKTTTVN VESKDNGKRT EVKIGAKTSV IKEKDGKLVT GKGKGENGSS 300

TDEGEGLVTA KEVIDAVNKA GWRMKTTTAN GQTGQADKFE TVTSGTNVTF ASGKGTTATV 360

SKDDQGNITV MYDVNVGDAL NVNQLQNSGW NLDSKAVAGS SGKVISGNVS PSKGKMDETV 420

NINAGNNIEI SRNGKNIDIA TSMAPQFSSV SLGAGADAPT LSVDDEGALN VGSKDANKPV 480

RITNVAPGVK EGDVTNVAQL KGVAQNLNNR IDNVDGNARA GIAQAIATAG LVQAYLPGKS 540

MMAIGGGTYR GEAGYAIGYS SISDGGNWII KGTASGNSRG HFGASASVGY QW 592 

<212> Type : PRT 
<211> Length : 592

SequenceName : SEQ ID 235 
SequenceDescription : 



Sequence 



<213> OrganismName : Neisseria meningitidis Z2491 
<400> PreSequenceString : 

MLLAEGQKSA VTEYYLNHGT WPSNNSDAGV ASTATDIKGK YVKEVKVEKG VITATMLSSG 60

VNNEIKGKKL SLWAKRQAGS VKWFCGQPVE RAANNAANDA VTAATANGNG KIDTKHLPST 120

CRDAASAVCI ETPPTAFYKN T 141
<212> Type : PRT 
<211> Length : 141 
SequenceName : SEQ ID 236
SequenceDescription :

Sequence

<213> OrganismName : Neisseria meningitidis Z2491
<400> PreSequenceString : 

MKTTDKRTTE THRKAPKTGR IRFSPAYLAI CLSFGILPQA WAGHTYFGIN YQYYRDFAEN 60 

KGKFAVGAKD lEVYNKKGEL VGKSMTKAPM IDFSWSRNG VAALVGDQYI VSVAHNGGYN 120

NVDFGAEGRN PDQHRFSYQI VKRNNYKPDN SHPYNGDYHM PRLHKFVTDA EPVEMTSDMR 180

GNTYSDKEKY PERVRIGS6H HYWRYDDDKH GDLSYSGAWL IGGNTHMQGW GNNGWSLSG 240

DVRHANDYGP MPIAGAAGDS GSPMFIYDKT NNKWLLNGVL QTGYPYSGRE NGFQLIRKDW 300

FYDDIYR6DT HTVFFEPRSN GHFSFTSNNN GTGTVTETNE KVSNPKLKVQ TVRLFDESLN 360
ETDICEPVYAA GGVNQYRPRL NNGENLSFID YGNGICLILSN NINQGAGGLY FSGDFTVSPE 420

NNETWQGAGV HISEDSTVTW KVNGVANDRL SKIGKGTLHV QAKGENQGSI SVGDGTVILD 480

QQADDKGKKQ AFSEIGLVSG RGTVQLNADN QFNPDKLYFG FRGGRLDLNG HSLSFHRIQN 540

TDEGAMIVNH NATTTSTVTI TGNESITQPS GKNINRLNYS KEIAYNGWFG EKDTTKTNGR 600

LNIiVYQPAAE DRTLLLSGGT NLNGNITQTN GKLFFSGRPT PHAYNHLGSG WSKMEGIPQG 660

EIVWDNDWIN RTFKAENFHI QGGQAVISRM VAKVEGDWHL SNHAQAVFGV APHQSHTICT 720

RSDWTGLTNC VEKTITDDKV lASLTKTDlS GNVSLADHAH LNLTGLATLN GNLSANGDTR 780

YTVSHNATQN GNLSLVGNAQ ATFNQATLNG NTSASGNASF NLSNNAAQNG SLTLSDNAKA 840

NVSHSALNGN VSIiADKAVFH FENSRFTGQIi SGSKDTALHL KDSEWTLPSG TELGNLNLDN 900 

ATITLNSAYR HDAAGAQTGS VSDTPRRRSR RSLLSVTPPT SVESRFNTLT VNGKLNGQGT 960 

FRFMSELFGY RSDKLKIiAES SEGTYTLAVN NTGNEPVSLD QLTWEGKDN KPLSENLNFT 1020 

LQNEHVDAGA WRYQLIRKDG EFRLHNPVKE QELSDKLGKA EAKKQAEKDN AQSLDALIAA 1080

GRDAAEKTES VAEPARQAGG EMVGIMQAEE EKKRVQADKD SALAKQREAE TRPATTAFPR 1140 

ARRARRDLPQ PQPQPQPQPQ PQRDLISRYA NSGLSEFSAT LNSVFAVQDE LDRVFAEDRR 1200 

NAVWTSGIRD TKHYRSQDFR AYRQQTDLRQ IGMQKNLGSG RVGILFSHNR TENTFDDGIG 1260 

MSARLAHGAV FGQYGIGRFD IGISTGAGFS SGSLSDGIGG KIRRRVLHYG IQARYRAGFG 1320

GFGIEPYIGA TRYFVQKADY RYENVNIATP GLAFNRYRAG IKADYSFKPA QHISITPYLS 1380

LSYTDAASGK VRTRVNTAVL AQDFGKTRSA EWGVNAEIKG FTLSIiHAAAA KGPQIjEAQHS 1440 

AGIKLGYRW 1449 
<212> Type : PRT 
<211> Length : 1449

SequenceName : SEQ ID 237

SequenceDescription :

Sequence 



<213> OrganismName : Neisseria meningitidis Z2491
<400> PreSequenceString :

MNTLQKGFTL lELMIVIAIV GIIiAAVALPA YQDYTARAQV SEAILLAEGQ KSAVTEYYLN 60 
HGEWPSNNXS AGVASSTDIK GKYVQSVEVK NGWTATMAS SNVNNEIKGK KLSLWAKRQD 120

GSVKWFCGQP VKRNDTATTN DDVKADTAAN GKQIDTKHLP STCRDAASAG 170 
<212> Type : PRT

<211> Length : 170 

SequenceName : SEQ ID 238 

SequenceDescription : 

Sequence



<213> OrganismName : Neisseria meningitidis Z2491 
<400> PreSequenceString : 

MQARLLIPIL FSVFILSACG TLTGIPSHGG GKRFAVEQEL VAASARAAVK DMDLQALHGR 60 
KVALYIATMG DQGSGSLTGG RYSIDALIRG EYINSPAVRT DYTYPRYETT AETTSGGLTG 120
LTTSLSTLNA PALSRTQSDG SGSKSSLGLN IGGMGDYRNE TLTTNPRDTA FLSHLVQTVF 180

FLRGIDWSP ANADTDVFIN IDVFGTIRNR TEMHLYNAET LKAQTKLEYF AVDRTNKKLL 240

IKPKTNAFEA AYKENYALWM GPYKVSKGIK PTEGLMVDFS DIQPYGNHMG NSAPSVEADN 300

SHEGYGYSDB AVRRHRQGQP 320 
<212> Type : PRT

<211> Length : 320 

SequenceName : SEQ ID 239 
SequenceDescription : 

Sequence



<213> OrganismName : Neisseria meningitidis Z2491
<400> PreSequenceString :

MRPIFLSFVL FPILITACST PDKSARWENI GTISNGNIHT YINKDSVRKN GNLMIFQDKK 60
WTNIiKQERF ANTPAYKTAI AEWEIHCNNK TYRLSSLQLF DTKNTEISTQ NYTASSLRPM 120
SILSGTLTEK QYETVCGKKL 140
<212> Type : PRT

<211> Length : 140 

SequenceName : SEQ ID 240 

SequenceDescription : 

Sequence



<213> OrganismName : Neisseria meningitidis Z2491 
<400> PreSequenceString : 
milCLFITALS ALALSACAGT WEGAKQDTAR MIjDKTQAAAE liAAEQTGNAV BKGWDKTKEA 60 
VKKGGNAVGR GISHLGGKIE NATE 84
<212> Type : PRT 
<211> Length : 84 

SequenceName : SEQ ID 241 

SequenceDescription : 


Sequence 



<213> OrganismName : Neisseria meningitidis Z2491 

<400> PreSequenceString : 
MKLLFIPLVL FVAVEHPYIA WLEMTQIPSE KAAETFKLPY EFMEQNRVQT LFGNQGLYNG 60

FLGIGLWSR FAAPDNAVYG ATVLFLGFVL lAAAWGAFSS GNKGILVKQG LPAFLAAAAV 120 

LAV 123 

<212> Type : PRT 

<211> Length : 123 
SequenceName : SEQ ID 242

SequenceDescription :

Sequence 



<213> OrganismName : Neisseria meningitidis Z2491
<400> PreSequenceString : 
MASSNVNNEI KDKKLSLWAK RQDGSVKP?FC GQPVKRDAAT DADVTADSGN EIDTKHLPST 60 
CRDAASAVCT KTPEYYPNHG EWPKNFVIPA QAGIQVCRHG NLSGKKVSPV LSSRFPLSWE 120 

<212> Type : PRT

<211> Length : 120 

SequenceName : SEQ ID 243 
SequenceDescription : 

Sequence



<213> OrganismName : Neisseria meningitidis Z2491 

<400> PreSequenceString : 

MLLAEGQKSA VTEYYLNHGE WPSNNTSAGV ATSTDIKGKY VQSVEVKNGV VTATMASSNV 60 
NNEIKGKKLS LWAKRQDGSV KWFCGQPVKR NDTATTNDDV KADTAANGKQ IDTKHLPSTA 120

STRKSTPN 128 

<212> Type : PRT 

<211> Length : 128 

SequenceName : SEQ ID 244 
SequenceDescription :

Sequence 



<213> OrganismName : Neisseria meningitidis Z2491 

<400> PreSequenceString :

MPIPFKPVLA AAAIAQAFPA FAADPAPQSA QTLNEITVTG THKTQKLGEE KIRRKTLDKL 60 

LVNDEHDLVR YDPGISWBG GRAGSNGPTI RGVDKDRVAI NVDGLAQAES RSSEAFQELF 120 

GAYGNFNANR NTSEPENFSB VTITKGADSL KSGSGALGGA VNYQTKSASD YVSEDKPYHL 180 

GIKG6SVGKN SQKFSSITAA GRLFGLDALL VYTRRPGKET KNRSTEGDIE IKNDGYVYNP 240 

TDTGGPSKYL TYVATGVARS QPDPQEWVNK STLFKLGYNF NDQNRIGWIF EDSRTDRFTN 300

ELSNLWTGTT TSAATGDYRH RQDVSYRRRS GVEYKNELEH GPWDSLKLRY DKQRIDMNTW 360 

TWDIPKNYDK RGINGEVYHS FRHIRQNTAQ WTADFEKQLD FSKAVWAAQY GLGGGKGDNA 420 



NSDYSYFAKL YDPKIIASNQ AKITMLIENR SKYKFAYWNN AFHLGGNDRF RLNAGIRYDK 480
NSSSAKDDPK YTTAIRGQIP HLGSEBiAHAG PSYGT6PDWR FTKHLHLLAK YSTGPRAPTS 540 
DETWLLFPHP DFYLKAKTPNL KAEKAKfiTWEIi GLAGSGKAGKT FKLSGFKTKY RDFIELTYMG 600 
VSSDDKNNPR YAPLSDGTAL VSSPVWQNQN RSAAWVKGIE PNGTWNLDSI GLPKGLHTGL 660 
NVSYIKGKAT QNNGKETPIN ALSPWTAVYS LGYDAPSKRW GIITAYATRTA AKKPSDTVHS 720
NDDLNNPWPY AKHSKAYTLF DLSAYLNIGK QVTLRAAAYN ITNKQYYTWE SLRSIREFGT 780 
VNRVDNKTHA GIQRFTSPGR SYNFTIEAKF 810 
<212> Type : PRT 
<211> Length : 810 
SequenceName : SEQ ID 245

SequenceDescription : 

Sequence 



<213> OrganismName : Neisseria meningitidis Z2491
<400> PreSequenceString : 

MKKSLIALTL AALPVAAMAD VTLYGTIKTG VETSRSVEHN GGQWSVBTG TGIVDLGSKI 60 
GFKGQEDLGN GLKAIWQVEQ KASIAGTDSG WGNRQSFIGL KGGFGKLRVG RIiNSVIiKDTG 120 
DINPWDSKSD YLGVNKIAEP ET^LISVRYD SPEFAGLSGS VQYAIiNDNVG RHNSESYHAG 180 

FNYKNGGFFV QYGGAYKRHQ DVDDVKIEKY QIHRLVSGYD ISIDALYASVAV QQQDAKLVED 240
NSHNSQTEVA ATLAYRFGNV TPRVSYAHGF KGSVDDAKRD NTYDQVWGA EYDFSKRTSA 300 
LVSAGWIiQEG KGENKFVATA GGVGLRHKF 329 
<212> Type : PRT 
<211> Length : 329 

SequenceName : SEQ ID 246

SequenceDescription : 

Sequence 



<213> OrganismName : Neisseria meningitidis Z2491
<400> PreSequenceString : 

MKTLLLLXPL VLTACGTL-TG XPAHGGGKRF AVEQBLVAAS SRAAVKEMDL SALKGRK2^AL 60 
YVSVMGDQGS GNISGGRYSI DALIRGGYHN KPESATQYSY PAYDTTATTK SDALSSVTTS 120 
TSLLNAFAAA LTKNSCTIKGE RSAGLSVNGT GDYRNETLLA NPRDVSPLTN LIQTVPYLR6 180 

XEWPPEYAD TDVFVTVDVF GTVRSRTELH LYWAETLKAQ TKLEYFAVDR DSRKLLIAPK 240
TAAYBSQYQB QYALWMGPYS VGKTVKASDR LMVDFSDITP YGDTTAQNRP DFKQNNGKKP 300
DVGNEVTRHR KGG 313
<212> Type : PRT
<211> Length : 313

SequenceName : SEQ ID 247

SequenceDescription : 

Sequence 



<213> OrganismName : Neisseria meningitidis Z2491
<400> PreSequenceString : 

MNKTLSILPV AILLGGCAAG GGNTFGSLDG 6TGMGGSIVK MAVESQCRAE LNKRSEWRLT 60 
AIiAMSABKQA EWENKICACV AQEAPNQLTG NDVMQMLDPS TRNQALAALT AKTVSACFKH 120 
LYR 123

<212> Type : PRT

<211> Length : 123 

SequenceName : SEQ ID 248 

SequenceDescription : 

Sequence



<213> OrganismName : Neisseria meningitidis Z2491 
<400> PreSequenceString : 

MNPLIHQAKE SSMQTRILSA VLLAFSTAAP AGGAFTLQFD NPSEDGGFTQ NQILSAPYGF 60 
GCSGGNASPA LSWKNPPAGT KSFVLTVYDK DAPTGLGWMH WWADIPADV RRRNA^SLQL 120

SRCASIADDQ SAAISAVISL QICRIRLTPS YTAKPMPSCC NHANTPQSAA SAALCGTSSS 180 

VSTAAA 186 

<212> Type : PRT 

<211> Length : 186 
SequenceName : SEQ ID 249

SequenceDescription : 
Sequence



<213> OrganismName : Neisseria meningitidis Z2491
<400> PreSequenceString : 

MNKTLKRRVF RHTALYAAIL MFSHTGGGGG AMAQTRQYAI IMNERNQPEV QWNGSYSIKD 60

KDRKREYTHH NHQQGGSSVS FNNSDELVSR QSGTAVFGTA TYLPPYGKVS GFDAAALKER 120 

NNAVDWIHTT HP6LIGYSYD GWCRSATDC PKLVYKTRFS PDNPDLAKTG GGLDKHTEPS 180 

RDNSPIYKLK DHPWLGVSFN LGAEGIAKN6 KTINKLVSSF NEKNSNNNLV YTTEGRDISL 240 

GNWQRETTAM AYYLNAKLHL LDKKQIQNIT DKTVQLGVLK PSIDVRTRNT GTAGILSYWA 300

KWDIKDTGQI PVKLSLTQVK AGRCVNKDNP NKNTKTSSPA LTAPALWFGA GQDGKAEMYS 360

ASVSTYPDSS SSRIFLQNIiK RKTDTSRPGR YSLATLNKSD lESREPSFTS RQTVIRLDGG 420 

VQQIKLDR3SIN TBVTGFNGND GKNDTFGIVS E6SFMPDASE WKKVLLPWTV RAFKTYDGRFN 480 

TVNKEENNGK PKYSQKYRSR KTMGKHERNLG DIVNSPIVAV GEYLATSAND GMVHIFKQSG 540 

GDKRSYHLKL SYIPGTMPRK DIESKDSTLA KELR^FAEKG Y^/GDRYGVDG CPVTaR-Rll'DD 600

QDKQKHFFMP GAMGLGGRGA YALDLTKADD NDPTKASLFD VKDNGNNGNN GNNRVELGYT 660

VGTPQI6KTH NGKYAAFLAS GYATKQIDSG ENKTAIiYVYD LESNNGTLIR KIEVTDGKGG 720 

LSSPTLVDKD LDGTVDIAYA GDRGGKMYRF DIiSGNNPNSW TVRTIFQGTK PITSAPAISQ 780 

LKDKRWIFG TGSDLSEDDV LSTDEQHIYG IFDNDTNTGT AQEGLGKGLL EQKLSEENKT 840 

LPLTDYKRSD GSGDKGWWK LKDGQRVTVK PTWLRTAFV TIHKYTGWDK CGAETAILGI 900 

NTADGGKLTK KSARPIVPAA NSKVAQYSGD KKTSSGKSIP IGCMBKDG6T VCPNGYVYDK 960

PVNVRYLDEK KTDGFSTTAD GDAGGSGTPK EGKKPARNNR CFSGKGVRTIj LMNDLDSLDI 1020

TGPMCGMKRI SWREVFY 1037
<212> Type : PRT 
<211> Length : 1037 

SequenceName : SEQ ID 250

SequenceDescription : 



Sequence



<213> OrganismName : Neisseria meningitidis Z2491
<400> PreSequenceString : 

MKHPKLTLXA AIiLTTAATAA PLPWTSFSI LGDVAKQIGG ERVSIQSLVG AMQDTHAYHM 60 
TSGDIKKIRS AKLVIiINGLG LEAADIQRAV KQSKVSYAEA TKGIQPLKAE EEGGHHHDHD 120 
HDHDHDHEGH HHDHGEYDPH VWNDPVLMSA YAQNVAEALI KADPEGKVYY QQRLGNYQMQ 180 

LKKLHSDAQA AFNAVPAAKR KVLTGHDAFS YMGKRYHXEF XAPQGVSSEA EPSAKQVAAI 240
IRQXKREGIK AVFTENXKDT RMVDRIAKET GVlSFVSGKIiYS DALGNAPADT YIGMYRHNIK 300 
ALTNAMKQ 308 
<212> Type : PRT 
<211> Length : 308

SequenceName : SEQ ID 251

SequenceDescription : 

Sequence 



<213> OrganismName : Streptococcus pyogenes MGAS8232
<400> PreSequenceString : 

MKKRILSAVL VSGVTLGAAT TVGAEDLSTK lAKQDSIXSN LTTEQKAAQN QVSALQAQVS 60

SLQSEQDKLT ARNTELEALS KRFEQEXKAL TSQXVARNEK LKNQARSAYK NWETSGYINA 120 
LLNSKSXSDV WRLVAXNRA VSANAKLLEQ QKADKVSLEE KQAANQTAXN TIAANMAMAE 180 

ENQNTLRTQQ ANLEAATANL ALQLASATED KANLVAQKEA AEKAAAEALA QEQAAKVKAQ 240
EQAAQQAASV EAAKSAITPA PQATPAAQSS NAIEPAALTA PAAPSARPQT SYDSSNTYPV 300

GQCTWGAKSL APWAGNNWGN GGQWAYSAQA AGYRTGSTPM VGAIAVWNDG GYGHVAVWE 360 
VQSASSIRVM ESNYSGRQYI ADHRGWFNPT GVTFIYPH 398 
<212> Type : PRT 

<211> Length : 398

SequenceName : SEQ ID 252 
SequenceDescription : 

Sequence 

<213> OrganismName : Streptococcus pyogenes MGAS8232 
<400> PreSequenceString : 

MITIKNPKIL KWLKYVLSAI LSLIILVIII GGLLFTFYIS SAPKLSEAQL KSTNSSLVYD 60 
GNNNLIADLG SBKRENVTAD SIPINLVNAI TSIEDKRPFN HRGVDLYRIF GAAFHNLTSQ 120 
TTQGGSTLDQ QLIKLAYFST NESDQTLKRK AQEVWLALQM ERKYTKQEIL TFYINKVYMG 180
NGNYGMLTAA KSYYGKDLKD LSYAQLALLA GIPQAPSQYD PYLHPEAAQN RRNWLQQMY 240 
MEKHLTKAEY ETAIATPVAE GLQSLQQRST YPKYMDNYLK QVIEEVKKBT NKDIFTAGLK 300 
VYTHIIPDAQ QTLYNIYHSG DYVYYPDQDF QVASTIVDVT KGHVIAQLGG RNQDENVSFG 360

TNQAVLTDRD WGSTMKPITA YAPAIESGVY TSTAQSTNDS VYYWPGTTTQ LFNWDLRYNG 420 

WMTIQAAIML SRNVPAVRAL EAAGLDYARS FLSSLGXKTYP EMHYSNAISS NWSSSDKKYG 480 

ASSBKMAAAY AAPANGGIYH KPRYVNKVEP SD6TSKTFDE KGKRAMKETT AYMMTDMLKT 540 

VLTYGTGTAA AIPGVAQAGK TGTSNYTDEE LAKIGEKYGIj YPDYVGTLAP DENFVGFTICR 600

YAMAVWTGYK NRLTPVYGSS LEIASDVYRS MMTYLTNGYS EDWTMPNGLY RSGGFLYLSG 660 

TYASNTDYTKT SVYNNLYSNN TTTASSQTTS DDTSSSNDTS NSTNTDNNGS HPSTDDKKTT 720 

H 721 
<212> Type : PRT 
<211> Length : 721

SequenceName : SEQ ID 253

SequenceDescription : 



Sequence 

<213> OrganismName : Streptococcus pyogenes MGAS8232
<400> PreSequenceString : 

MIITKKSIiFV TSVALSIiAPIi VTAQAQEWTP RSVTEIKSEL VLVDNVFTYT VKYGDTLSTI 60 
AEAMGIDVHV LGDINHIANI DLIFPDTILT ANYNQHGQAT TLTVQAPASS PASVSHVPSS 120

EPLPQASATS QSTVPMAPSA TPSDVPTTPL ASAKPDSFVT ASSEIiTSSTN DVSTELSSES 180

QKQPEVSQEA VPTPKAAETT EVEPKTDISE DPTSANRPVP NESASEEASS AAPAQAPAEK 240
EETSQMLTAP AAQKAVADTT SVATSNGLSY APNHAYNPMN AGLQPQTAAF KEEVASAFGI 300

TSFSGYRPGD PGDHGKGLAI DFMVPVSSTL GDQVAQYAID HMAERGISYV IWKQRFYAPF 360 
ASIYGPAYTW NPMPDRGSIT ENHYDHVHVS FNA 393 

<212> Type : PRT

<211> Length : 393 

SequenceName : SEQ ID 254
SequenceDescription : 

Sequence



<213> OrganismName : Streptococcus pyogenes MGAS8232
<400> PreSequenceString :

MKKKILLMMS LISVFFAWQL TQAKQVLAEG KVKWTTFYP VYEFTKGVIG NDGDVSMLMK 60 
AGTBPHDFEP STKDIKKIQD ADAFVYMDDN METWVSDVKK SLTSKKVTIV KGTGNMLLVA 120
GAGHDHHHED ADKKHEHNKH SEBGHNHAFD PHVWLSPYRS ITWENIRDS LSKAYPEKAE 180
NFKANAATYI EKLKBLDKDY TAALSDAKQK SFVTQHAAFG YMALDYGLNQ ISINGVTPDA 240
BPSAKRIATL SKYVKKYGIK YIYFEENASS KVAKTLAKEA GVKAAVLSPL EGLTKKEMKA 300

GQDYFTVMRK NLETLRLTTD VAGKEILPEK DTTKTVYNGY FKDKBVKDRQ LSDWSGSWQS 360

VYPYLQDGTL DQVWDYKAKK SKGKMTAAEY KDYYTTGYKT DVEQIKINGK KKTMTFVRNG 420
EKKTFTYTYA GKEILTYPKG NRGVRFMFEA KEPNAGEFKY VQFSDHAIAP EKAEHFHLYW 480 
GGDSQEKLHK ELEHWPTYYG SDLSGREIAQ EINAH 515 
<212> Type : PRT 
<211> Length : 515 
SequenceName : SEQ ID 255

SequenceDescription : 

Sequence 



<213> OrganismName : Streptococcus pyogenes MGAS8232
<400> PreSequenceString : 

MKKFHRPLVS GVILL6PNGL VPTMPSTLIS QQENLVHAAV LGDNYPSKWK KGNGIDSVJNM 60 
YIRQCTSFAA FRLSSANGFQ LPKGYGNACT WGHIAKNQGY PVNKTPSIGA lAWFDKNAYQ 120 
SNAAYDHVAW VADIRGDTVT lEEYNYNAGQ GPERYHKRQI PKSQVSGYIH FKDLSSQTSH 180 

SYPRQLKHIS QASFDPSGTY HFTTRLPVKG QTSIDSPDLA YYEAGQSVYY DKWTAGGYT 240
WLSYLSFS6N RRYIPIKEPA QSWQNDNTK PSIKVGDTVT FPGVFRVDQL VNNLIVNKEL 300

AGGDPTPLNW IDPTPLDBTD NQGKVLGNQI LRVGBYFTVT GSYKVLKIDQ PSNGIYVQIG 360 
SRGTWVNADK ANKL 374 
<212> Type : PRT 

<211> Length : 374

SequenceName : SEQ ID 256 
SequenceDescription : 

Sequence 


<213> OrganismName : Streptococcus pyogenes MGAS8232 
<400> PreSequenceString : 
MLKFTSNILA TSVAETTQVA PGGCCCCCTT CCPSIATGSG NSQGGSGSYT PGK 53



<212> Type : PRT 
<211> Length : 53 
SequenceName : SEQ ID 257

SequenceDescription :



Sequence 



<213> OrganismName : Streptococcus pyogenes MGAS8232
<400> PreSequenceString :
MGESYSVEAV LTAVDKTFGK TLQSAIRSIE GLEKRSTGFS SVSQKASSMP KSMLGAJNLAG 60
QAISAMTRTV SSGLGSML6E MNSSAKAWKT FDANLADIGF GKKQILAVKT AMQDYATKTI 120
YSASDMASTY AQL_»JVVGVKD TGKLVKAFGG LAASAENPKQ AMKSISQQMT QAVGRPTVilW 180
QDFRIMLEQT PAGMAKVAKS MGKNIiDELVA DIQAGRVKTS DFLEAVKKAG NDKSFQKMAT 240
EFKTVDQAID GMREGLSNKL QPAFEKVNQF GIRAIEAIGK QLDKVDFSKP ASNL6KFLEG 300
INIDKIVSNI SSAVSSVTSK VKEFWDGFKQ TGAISAFSGA LQSVWGALKN VASAMSGGNW 360
KTFGATVGGI VKHVSNPAKA VSDVLGKMDP GRLRSWIATF AAVAGGFKLF EKLTGQSVIG 420
SFLDKIGSKF GLFGNKAKEG TDKASNGARR SGGIISQIFS GLGNIVKSAG TAISTAAKGI 480
GVGIKTALSG IPPYH 495
<212> Type : PRT
<211> Length : 495

SequenceName : SEQ ID 258
SequenceDescription :

Sequence




<213> OrganismName : Streptococcus pyogenes MGAS8232
<400> PreSequenceString : 

MKKGFFLMVM WSLVMIAGC DKSANPKQPT QGMSWTSFY PMYAMTKEVS GDLNDVRMIQ 60
SGAGIHSFEP SVNDVAAIYD ADIiFVYHSHT LEAWARDLDP NLKKSKVDVF EASKPLTLDR 120 
VKGLEDMEVT QGIDPATLYD PHTWTDPVLA GEBAVNIAKE LGRLDPKHKD SYTKNAKAFK 180 
KEAEQLTEEY TQKFKKVRSK TFVTQHTAFS YLAKRFGLKQ LGISGISPBQ EPSPRQLKEI 240 
QDFVKEYNVK TIFAEDNVNP KIAHAXAKST GAKVKTLSPL EAAPSGNKTY LENLRANLEV 300

LYQQLK 306

<212> Type : PRT
<211> Length : 306 

SequenceName : SEQ ID 259
SequenceDescription : 


Sequence 



<213> OrganismName : Streptococcus pyogenes MGAS8232
<400> PreSequenceString :

MEKKQRFSLR KYKSGTFSVL IGSVFLMMTT TVAADELSTM SEPTITNHTQ QQAQHLTNTE 60
LSSAESKSQD TSQITPKTNR EKEQPQGLVS EPTTTELADT DAAPMANTGP DATQKSASLP 120
PVNTDVHDWV KTKGAWDKGY KGQGKWAVI DTGIDPAHQS MRISDVSTAK VKSKEDMLAR 180
QKAAGINYGS WINDKWFAH NYVENSDNIK ENQFEDFDED WENFEFDAEA EPKAIKKHKI 240
YRPQSTQAPK ETVIKTEETD GSHDIDWTQT DDDTKYESHG MHVTGIVAGN SKEAAATGER 300
FLGIAPEAQV MFMRVFANDV MGSAESLFIK AIEDAVALGA DVINLSLGTA NGAQLSGSKP 360
LMEAIEKAKK AGVSVWAAG NERVYGSDHD DPLAINPDYG LVGSPSTGRT PTSVAAINSK 420
WVIQRLMTVK ELENRADLNH GKAIYSESVD FKNIKDSLGY DKSHQFAYVK ESTDAGYKAQ 480
DVKDKIALIE RDPNKTYDEM lALAKKHGAL GVLIPNNKPG QSNRSMRLTA NGMGIPSAFI 540
SHEFGKJyyiSQ LNGNGTGSLE FDSWSKAPS QKGNEMNHFS NWGLTSDGYL KPDITAPG6D 600
IYSTYND3SIHY GSQTGTSMAS PQIAGASLLV KQYLEKTQPN LPKEKIADIV KNLLMSNAQI 660
HVNPETKTTT SPRQQGAGLL NIDGAVTSGL YVTGKDNYGS ISLGNITDTM TFDVTVHNLS 720
NKDKTLRYDT ELLTDHVDPQ KGRFTLTSRS LKTYQGGEVT VPANGKVTVR VTMDVSQPTK 780
ELTKQMSNGY YLEGFVRFRD SQDDQLNRVN IPFVGFKGQF ENLAVAEESI YRLKSQGKTG 840
FYFDESGPKD DIYV6KHFTG LVTLGSETNV STKTISDNGL HTLGTFKNAD GKFILEKNAQ 900
GNPVLAISPN GDNNQDFAAF KGVFLRKYQG LKASVYHASD KEHKNPLWVS PESFKGDKNF 960
NSDIRFAKST TLLGTAFSGK SLTGAELPDG YYHYWSYYP DWGAKRQEM TFDMILDRQK 1020
PVLSQATFDP ETNRPKPEPL KDRGLAGVRK DSVFYLERKD NKPYTVTIND SYKYVSVEDN 1080
KTFVERQADG SFILPLDKAK LGDFYYMNTBD FAGNVAIAKL GDHLPQTLGK TPIKLKLTDG 1140
NYQTKETLKD NLEMTQSDTG LVTNQAQIiAV VHRNQPQSQL TKMNQDFPIS PNEDGNKDFV 1200
AFKGLKNNVY NDLTVNVYAK DDHQKQTPIW SSQAGASASA lESTAWYGIT ARGSKVMPGD 1260
YQYWTYRDB HGKEHQKQYT ISVNDKKPMI TQGRFDTING VDHFTPDKTK ALGSSGIVRE 1320
EVFYLAKKMG RKFDVTEGKD GITVSDNKMY IPKNPDGSYT ISKRDGVTLS DYYYLVEDRA 1380
GNVSFATLRD LKAVGKDKAV VNFGLDLPVP EDKQIVNFTY LVRDADGKPn ENBEYY3SINSG 1440

NSLILPYGKY TVELIiTYDTN AAKL.ESDKIV SFTLSADNNF QQVTFKMTML* ATSQITAHFD 1500 

HLLPEGSRVS LKTAQGQLIP LEQSLYVPKA YGKTVQEGTY EVVVSLPKCY" RIEGNTKVNT 1560 

IiPNEVHELSIi RLVKVGDASD STGDHKVMSK NNSQALTAFA TPTKTTTSATr AKAIiPSAGEK 1620 

MGLKLRIVGL VLLGLTCVFS RKKSTKD 1647



<212> Type : PRT 

<211> Length : 1647 

SequenceName : SEQ ID 260
SequenceDescription :

Sequence



<213> OrganismName : Treponema pallidum subsp. pallidum str. Nichols

<400> PreSequenceString :

MMRSLFSGVS GMQNHQTRMD VIGNNVANVN TTGFKRGRVN FQDLISQQLS AAARPliTEEVG 60
GVNPKEVGLG VLIASIDTVH TQGALQTTGI NTDVSIQGSG FFVLKSGEK17 FFTRAGAPGV 120 
DNAGTLVNPA NGMRVQGWMA QDVAGERLIN SSAQTQDLVI PIGQKIDAQC2 TSTVHYACNL 180 
DKRIiPELAAD ANEADVRKST WTTDFQVYDS PGQQHTLQIN FSRVPGTNNC2 WQATVAVDPG 240 
TEVDTQTRVG VGTSDGAANT FIVNFDNFGH LASVTDTAGN VTGPTGQVLL. EASYDWGAN 300

PDDAGQVTRH AFTLMLGEIG TARNTITQFA ERSTTKAYRQ DGYAMGYLEISr FKIDQSGVIT 360

GVYSNGVSQD IGQLAXiAGFA NQGGLEKAGE NTYVOSISnsTSG lANISTSGVM GKGKLIAGTL 420 
EMSNVDLTDQ FTDMIITQKG FQAGAKTIQT SDTMLDTVLS LKR 463 
<212> Type : PRT 
<211> Length : 463 

SequenceName : SEQ ID 261

SequenceDescription : 

Sequence 



<213> OrganismName : Treponema pallidum subsp. pallidum str. Nichols



<400> PreSequenceString : 

MGCMRWGSVL CVWGVGASG GVLGQEFSPK LTGSATLEWG ISYGKGVGSE^ GQAPGAVMGT 60 

GPYNLKHGFR TTNTVGVSFP LVMRTTHTRR GQHPALYAEL KVADLQADLS QGKAGFAVKR 120 

KGKVEATLHC YGAYLTXGKIT PTFLTNFARL WKPWVTAQYQ EDAVQYAPG^ GGLGGKVGYR 180 

AQDIGGSGVS LDVGFLSFAS NGAWDSTDPT HSKYGFGADL KLMYARAGHF^ LCTVELASNV 240

TLEDGYLIGA QKDANNQNKD KLLWNVGGRL TLEPGAGFRF SFALDAGNQEai QSAQDFQNRT 300

QRAQSELTAL SNNLFQGESQ KQEAWVTQW QQATQTVTAG VRSALESRGT? TYINALEAVQ 360

PNPAKPTGKV VQNLHTPQGS PPNLPPLPAL PAFSLMGQVL LQYDAEQWK GFEQVQTQIV 420

TEINQKVQAA VAKWNANMQA VGGSL6DTAR MVGEADIKQQ LSRKQNSILT7 MVSVQDEVKQ 480 

DLADLVPMMR TEITAFFASV QQHITEEVKK KTDALNAGQQ IRQAIQNLRft*. SAWRAFLMGV 540

SAVCLYLDTY NVAFDALFTA QWKWLSSGIY FATAPANVFG TRVLDNTIAS CGDFAGFLKL 600 

ETKSGDPYTH LLTGLDAGVE TRVYIPLTHD LYKNNNGNPL PSGGSSGHIS LPWGKAWCS 660 

YRIPVQDYGW VKPSVTVHAS TNRAHLNAPA AGGAVGATYL TKEYCAQLRJ^ GISASLXEKT 720

VFSLDWEQGM LSDVPYLLVS ECLTQGIGRI VCGVTLSW 758 

<212> Type : PRT

<211> Length : 758 

SequenceName : SEQ ID 262 
SequenceDescription :

Sequence



<213> OrganismName : Treponema pallidum subsp. pallidum str. Nichols



<400> PreSequenceString : 

MGRQVMQAGV LAGMVCAASG YAGVLTPQVS GTAQLQWGIA FQKNPRTGPG. KHTHGFRTTN 60 

SLTISLPLVS KHTHTRRGEA RSGVWAQLQL KDLAVELASS KSSTALSFTK PTASFQATLH 120

CYGAYLTVGT SPSCWNPAQ LWKPFVTRAY SEKDTRYAPG FSGSGAKLxGh^ QAHNVGNSGV 180

DVDIGFLSPL SNGAWDSTDT THSKYGPQAD ATLSYGVDRQ RLLTLBLAGbJ ATLDQNYVKG 240 

TEDSKNENKT ALLWGVGGRL TLEPGAGFRF SFALDAGNQH QSNAHAQTQS RAILKAREVF 300 

RRVEGKLVQN LPNIMMPPGI TEQTTLIEMV GLAALIAEGT LGSAIQTVL?^ AGALAALVSQ 360 

LVPNIEQGVR DVFRSSDPRV VTAKLLAFLE RAPMNALNID ALLRMQWKWnj SSGIYFATAG 420

TNIFGKRVFA TTRAHYFDPA GFLKLETKSG DPYTHLLTGL NAGVEARVYIZ PLTYIRYRNN 480 

GGYELNGAVP PGTINMPILG KAWCSYRIPL GSHAWLAPHT SVLGTTNRFIST IINPAGNLLN 540 

ERALQYQVGL TFSPFEKVEL SAQWEQGVLA DAPYMGIAES IWSERHFGTIIj VCGMKVTW 598 



<212> Type : PRT

<211> Length : 598

SequenceName : SEQ ID 263
SequenceDescription :
Sequence 

<213> OrganismName : Treponema pallidum subsp. pallidum str. Nichols
<400> PreSequenceString : 

MGRQVMQAGV LAGMVCAASG YAGVLTPQVS GTAQLQWGIA FQKNPRTGPG KHTHGPRTTN 60 

SIiTISLPLVS KHTHTRRGEA RSGVWAQLQL KDLAVELASS KSSTALSFTK PTASFQATLH 120 

CYGAYliTVGT SPSCWNFAQ LWKPFVTRAY SEKDTRYAPG FSGSGAKLGY QAHNVGNSGV 180 

DVDIGFLSFL SNGAWDSTDT THSKYGFGAD ATLSYGVDRQ RLLTLELAOKT ATLDQNYVKG 240

TEDSKNENKT ALLWGVGGRL TLEPGAGFRF SFALDAGNQH QSNAHAQTQE RAILKAREVF 300

RRVEGKLVQN LPNIMMPPGI TEQTTLIEMV GLAALIAEGT LGSAIQTVLA AGALAALVSQ 360 

LVPNIEQGVR DVFRSSDPRV VTAKLIiAFLE RAPMNALNID ALLRMQWKWL SSGIYFATAG 420 
TOIFGKRVFA TTRAtFYFDFA GFLKLETKSG DPYTHLLTGL NAGVEARVYI PLTYIRYRNN 480

GGYELNGAVP PGTINMPILG KAWCSYRXPL GSHAWLAPHT SVLGTINRFN IINPAGNLLN 540

ERALQYQVGIi TFSPFEKVEL SAQWEQGVLA DAPYMGIAES IWSERHFGTIi VCGMKVTW 598 

<212> Type : PRT 
<211> Length : 598 
SequenceName : SEQ ID 264

SequenceDescription : 

Sequence 






<213> OrganismName : SARS coronavirus Frankfurt 1
<400> PreSequenceString : 

MFIFLLFLTL TSGSDLDRCT TFDDVQAPNY TQHTSSMRGV YYPDEIFRSD TIiYLTQDLFL 60 

PFYSNVTGFH TINHTFGNPV IPFKDGIYFA ATEKSNWRG WVFGSTMNNK SQSVIIINHS 120 

TNWIRACWF ELCDNPFFAV SKPMGTQTHT MIFDNAFNCT FBYISDAFSL DVSEKSGNFK 180 

HLREPVFKNK DGFIxYVYKGY QPIDWRDLP SGFNTLKPIF KLPLGINITN FRAILTAFSP 240

AQDIWGTSAA AYFVGYIiKPT TFMLKYDENG TITDAVDCSQ NPLAELKCSV KSFEIDKGIY 300

QTSNFRWPS GDWRFPNIT NLCPFGEVFN ATKFPSVYAW ERKKISNCVA DYSVLYNSTF 360 

FSTFKCYGVS ATKCiNDIiCFS I3VYADSFVVK GDDVRQIAPG QTGVIADYNY KLPDDFMGCV 420

lAVaWTRNIDA TSTGNYNYKY RYLRHGKLRP FERDISNVPF SPDGKPCTPP ALNCYWPLND 480 

YGFYTTTGIG YQPYRWVIiS FELIiNAPATV CGPKLSTDLI KNQCVNFNFN GLTGTGVLTP 540

SSKRFQPFQQ FGRDVSDFTD SVRDPKTSEI LDISPCSFCG VSVITPGTNA SSEVAVLYQD 600 

VNCTDVSTAI HADQLTPAWR lYSTGNNVFQ TQAGCTLIGAE HVDTSYECDI PIGAGICASY 660 

HTVSKLRSTS QKSIVAYTMS LGADSSXAYS NNTIAIPTNF SISITTEVMP VSMAKTSVDC 720 

NMYIOSDSTE CANLLLQYGS FCTQIiNRALS GIAAEQDRNT REVFAQVKQM YKTPTLKYFG 780 

GFNFSQXIiPD PLKPTKRSFI EDLLFNKVTIi ADAGFMKQYG ECLGDIIXARD LICAQKFWGL 840

TVLPPLLTDD MIAAYTAAIiV SGTATAGWTF GAGAATiQIPF AMQMAYRFNG IGVTQNVLYE 900 

NQKQIANQPN KAISQIQESL TTTSTALGKli QDWNQNAQA LNTLVKQLSS NFGAISSVLN 960 

DILSRLDKVE AEVQIDRLIT GRLQSLQTYV TQQLIRAAEI RASANLAATK MSECVLGQSK 1020 

RVDFCGKGYH IiMSFPQAAPH GWFLHVTYV PSQERNFTTA PAICHEGKAY FPREGVFVFN 1080 

GTSWFITQRN FFSPQIITTD NTFVSGNCDV VIGIIilNTVY DPLQPELDSF KEELDKYPKN 1140

HTSPDVDFGD ISGINASWN IQKEIDRLNE VAKNLNESLI DLQBLGKYBQ YIKWPWYVWL 1200

GPIAGLIAIV MVTILLCCMT SCCSCLKGAC SCGSCCKFDE DDSEPVLKGV KLHYT 1255 

<212> Type : PRT 
<211> Length : 1255

SequenceName : SEQ ID 265
SequenceDescription :

Sequence



<213> OrganismName : SARS coronavirus HSR 1
<400> PreSequenceString : 

MFIFLLFLTL TSGSDLDRCT TFDDVQAPNY TQHTSSMRGV YYPDEIFRSD TLYLTQDLFL 60 

PFYSNVTGFH TINHTFGNPV IPFKDGIYFA ATEKSNWRG WVFGSTMNNK SQSVIIINNS 120 

TNWIRACNF ELCDNPFFAV SKPMGTQTHT MIFDNAFNCT FEYISDAFSL DVSEKSGNFK 180

HLREFVFKNK DGFLYVYKGY QPIDWRDLP SGFNTLKPIF KLPLGINITN FRAILTAFSP 240 

AQDIWGTSAA AYFVGYLKPT TFMLKYDENG TITDAVDCSQ NPLAELKCSV KSFEIDKGIY 300

QTSNFRWPS GDWRFPNIT NLCPPGEVFN ATKFPSVYAW ERKKISNCVA DYSVLYNSTF 360 

FSTFKCYGVS ATKLNDLCFS NVYADSFWK GDDVRQIAPG QTGVIADYNY KLPDDFMGCV 420 

LAWNTRNIDA TSTGNYNYKY RYLRHGKLRP FERDISNVPF SPDGKPCTPP ALNCYWPLND 480

YGFYTTTGIG YQPYRVWLS FELLNAPATV CGPKLSTDLI KNQCVNFNFN GLTGTGVLTP 540 

SSKRFQPFQQ FGRDVSDFTD SVRDPKTSEI LDISPCSFGG VSVITPGTNA SSEVAVLYQD 600 
VNCTDVSTAI HADQLTPAWR IYSTGNNVFQ
HTVSLLRSTS QKSIVAYTMS LGADSSIAYS 
NMYICGDSTE CANLLLQYGS PCTQLNRALS 
GFNFSQILPD PIiBCPTKRSFI EDLLFNKVTL 
5 TVLPPLLTDD MIAAYTAALV SGTATAGWTF 
NQKQIANQFN KAISQIQESL TTTSTALGKL 
DILSRLDKVB AEVQIDRLIT GRLQSLQTYV 
RVDFCGKGYH LMSFPQAAPH GWFLHVTYV 
GTSWFITQRN FFSPQIITTD NTFVSGNC3DV 
10 HTSPDVDLGD ISGINASWN IQKEIDRLNE 
GFIAGLIAIV MVTIIiLCCMT SCCSCLKGAC 

<212> Type : PRT 
<211> Length : 1255
SequenceName : SEQ ID 266

SequenceDescription : 

Sequence 



<213> OrganismName : SARS coronavirus ZJ01
<400> PreSequenceString : 

MFIFLLFIiTL TSGSDLDRCT TFDDVQAPNY TQHTSSMRGV YYPDEIFRSD TLYLTQDLFL 60 

PFYSNVTGFH TINHTFGNPV IPFKDGIYFA ATEKSNWRG WVFGSTMNNK SQSVUINNS 120 

TiJWIRACMP EIiCDNPFFAV SKPMGTQTHT MIFDNAFNCT FEYISDAFSL DVSEKSGNFK 180 

25 HLREFVFKNK DGFLYVYKGY QPIDWRDLP SGFNTLKPIF KLPLGINITKT FRAILTAFS.P 240 

AQDIWGTSAA AYFVGYLKPT TFMLKYDEKTG TITDAVDCSQ NPIjAEIiKCSV KSFEIDKGIY 300 

QTSNFRWPS GDWRFPNIT NLCPFGEVFN ATKFPSVYAW ERKKISNCVA DYSVLYNSTF 360 

PSTFKCYGVS ATKLNDLCPS NVYADSPWK GDDVRQIAPG QTGVIADYNY KLPDDFMGCV 420 

LAWNTRNIDA TSTGNYNYKY RYLRHGKLRP FERDISNVPF SPDGKPCTPP ALNCYWPLND 480 

30 YGFYTTTGIG YQPYRVWLS FELLNAPATV CGPBCLSTDLI KNQCVNFNPN GLTGTGVLTP 540 

SSKRFQPFQQ FGRDVSDFTD SVRDPKTSEI LDISPCSFGG VSVITPGTNA SSEVAVIiYQD 600 

VNCTDVSTAI HADQLTPAWR lYSTGKNVPQ TQAGCLIGZVE HVDTSYECDI PIGAGICASY 660 

HTVSLLRSTS QKSIVAYTMS LOADSSIAYS NWTIAIPTNF SISITTEVMP VSMAKTSVDC 720 

NMYICGDSTE CANLLLQYGS FCTQLNRALS GIAAEQDRNT REVFAQVKQM YKTPTLKYFG 780 

35 GFNFSQILPD PLKPTKRSFI EDLLFNKVTL ADAGFMKQYG ECLGDINARD LICAQKFN6L 840 

TVLPPLLTDD MIAAYTAALV SGTATAGWTF GAGAALQIPF AMQMAYRFNG IGVTQNVLYS 900 

NQKQIANQFN KAISQIQESL TTTSTALGKL QDWNQNAQA LNTLVKQLSS NF6AISSVLN 960 

DILSRLDKVE AEVQIDRLIT GliLQSLQTYV TQQLIRAAEI RASANLAATK MSECVLGQSK 1020 

RVDFCGKGYH LMSFPQAAPH GWFLHVTYV PSQERNFTTA PAICHEGKAY FPREGVFVFN 1080 

40 GTSWFITQRN FFSPQIITTD NTFVSGNCDV VIGIINNTVY DPLQPELDSF KEELDKYFKN 1140 

HTSPDVDLGD ISGINASWN IQKEIDRLNE VAKNLNESLI DLQELGKYEQ YIKWPWYVWL 12 0 0 

GFIAGLIAIV MVTILLCCMT SCCSCLKGAC SCGSCCKFDE DDSEPVLKGV KLHYT 1255 

<212> Type : PRT 
<211> Length : 1255

SequenceName : SEQ ID 267 
SequenceDescription : 

Sequence

<213> OrganismName : SARS coronavirus TW1
<400> PreSequenceString : 

MFIFLLFLTL TSGSDLDRCT TFDDVQAPNY TQHTSSMRGV YYPDEIFRSD TLYLTQDLFL 60 

PFYSNVTGFH TINHTFGNPV IPFKDGIYFA ATEKSNWRG WVFGSTMNNK SQSVIIINNS 120 

55 TNWIRACNF ELCDNPPFAV SKPMGTQTHT MIFDNAFNCT FEYISDAFSL DVSEKSGNFK 180 

HLREFVFKNK DGFLYVYKGY QPIDWRDLP SGFNTLKPIP KLPLGINITN FRAILTAFSP 240 

AQDIWGTSAA AYPVGYLKPT TFMLKYDENG TITDAVDCSQ NPLAELKCSV ICSFEIDKGIY 3 00 

QTSNPRWPS GDWRFPNIT NLCPFGEVFN ATKFPSVYAW ERKKISNCVA DYSVLYNSTF 360 

PSTFKCYGVS ATKLNDLCFS NVYADSPWK GDDVRQIAPG QTGVIADYNY KLPDDFMGCV 420 

60 LAWNTRNIDA TSTGNYNYKY RYLRHGKLRP FERDISNVPF SPDGKPCTPP ALNCYWPLND 480 

YGFYTTTGIG YQPYRVWLS FELLNAPATV CGPKLSTDLI KNQCVNFNFN GLTGTGVLTP 540 

SSKRFQPFQQ FGRDVSDFTD SVRDPKTSEI LDISPCSFGG VSVITPGTNA SSEVAVLYQD 600 

VNCTDVSTAI HADQLTPAWR lYSTGNNVFQ TQAGCLIGAE HVDTSYECDI PIGAGICASY 660 

HTVSLLRSTS QKSIVAYTMS LGADSSIAYS NNTIAIPTNF SISITTEVMP VSMAKTSVDC 720 

65 NMYICGDSTE CANLLLQYGS FCTQLNRALS GIAAEQDRNT REVFAQVKQM YKTPTLKYFG 780 

GFNFSQILPD PLKPTKRSFI EDLLFNKVTL ADAGFMKQYG ECLGDINARD LICAQKFNGL 840 

TVLPPLLTDD MIAAYTAALV SGTATAGWTF GAGAALQIPF AMQMAYRFNG IGVTQNVLYE 900 



TQAGCLIGAE HVDTSYECDI PIGAGICASY 660 

NNTIAIPTNF SISITTEVMP VSMAKTSVDC 720 

GIAAEQDRNT REVFAQVKQM YKTPTLKYFG 780 

ADAGFMKQYG ECLGDINARD LICAQKFNGL 840 

GAGAALQIPF AMQMAYRFNG IGVTQNVLYE 900 

QDWNQNAQA LNTLVKQLSS NFGAISSVLN 960 

TQQLIRAAEI RASANLAATK MSECVLGQSK 1020 

PSQERNFTTA PAICHEGKAY FPREGVFVFN 1080 

VIGIINNTVY DPLQPELDSF KEELDKYFKN 1140 

VAKNLNESLI DLQELGKYEQ YIKWPWYVWL 1200 

SCGSCCKFDE DDSEPVLKGV KLHYT 1255
NQKQIANQFN KAISQIQESL TTTSTALGKL QDWNQNAQA LNTLVKQLSS NFGAISSVLN 960

DIIiSRLDKVE AEVQIDRLIT GRLQSLQTYV TQQLIRAAEI RASANIiAATK MSECVLGQSK 1020 

RVDFCGKGYH LMSFPQAAPH GWPLHVTYV PSQBRNFTTA PAICHEGKAY FPREGVFVFN 1080 

GTSWFITQRN FFSPQIITTD NTFVSGNCDV VIGII3SINTVY DPLQPELDSF KEELDKYFKN 1140 

5 HTSPDVDLGD ISGIMASWN IQKEIDRLNE VAKNUSTESLI DLQELGKYEQ YIKWPWYVPWli 1200 

GFIAGLIAIV MVTILIiCCMT SCCSCLKGAC SCGSCCKFDE DDSEPVLKGV KLHYT 1255 



<212> Type : PRT 
<211> Length : 1255 
SequenceName : SEQ ID 268

SequenceDescription :

Sequence

<213> OrganismName : SARS coronavirus CUHK-Su10
<400> PreSequenceString : 

MFIFLLFLTL TSGSDLDRCT TFDDVQAPNY TQHTSSMRGV YYPDEIFRSD TLYLTQDLPL 60

PFYSNVTGFH TINHTFGNPV IPFKDGIYFA ATEKSNWRG WVFGSTMNNK SQSVIIINNS 120 

TNWIRACNF ELCDNPFFAV SKPMGTQTHT MIFDNAFNCT FEYISDAFSL DVSEKSGNFK 180 

20 HLREFVFKKTK DGFIiYVYKGY QPIDWRDLP SGFNTLKPIF KLPLGINITN FRAILTAFSP 240 

AQDIWGTSAA AYFVGYLKPT TFMLKYDENG TITDAVDCSQ NPIiAELKCSV KSFEIDKGIY 300 

QTSNFRWPS GDWRFPNIT NLCPFGEVPN ATKFPSVYAW ERKKISNCVA DYSVTiYNSTF 360 

FSTFKCYGVS ATKLNDLCFS NVYADSFWK GDDVRQIAPG QTGVIADYNY KLPDDFMGCV 420 

LAWNTRKTIDA TSTGNYNYKY RYLRHGKQRP FERDISNVPP SPDGKPCTPP ALNCYWPLND 480 

25 YGFYTTTGIG YQPYRVWLS FELLNAPATV CGPKLSTDLI KNQCVNFNFN GLTGTGVLTP 540 

SSKRFQPFQQ FGRDVSDFTD SVRDPKTSEI LDISPGSFGG VSVITPGTNA SSEVAVLYQD 600 

VNCTDVSTAI HADQLTPAWR lYSTGNNVFQ TQAGCLIGAE HVDTSYECDI PIQAGICASY 660 

HTVSLLRSTS QKSIVAYTMS LGADSSIAYS NNTIAIPTNF SISITTEVMP VSMAKTSVDC 720 

NMYICGDSTE CANLLLQYGS FCTQLNRALS GIAAEQDRNT REVFAQVKQM YKTPTLKYFG 780 

30 GFNFSQILPD PLKPTKRSFI EDLLPNKVTL ADAGFMKQYG ECLODINARD LICAQKFNGL 840 

TVIiPPIiLTDD MIAAYTAALV SGTATAGWTF GAGAALQIPF AMQMAYRFKG IGVTQWVLYE 900 

NQKQIANQFN KAISQIQESL TTTSTALGKL QDWNQNAQA LNTLVKQLSS NFGAISSVLN 960 

DILSRLDKVE AEVQIDRLIT GRLQSLQTYV TQQLIRAAEI RASANLAATK MSECVLGQSK 1020

RVDFCGKGYH LMSFPQAAPH GWFLHVTYV PSQERNFTTA PAICHEGKAY FPREGVFVFN 1080 

35 GTSWFITQRN FFSPQIITTD NTFVSGNCDV VIGIINNTVY DPLQPELDSF KEELDKYFKN 1140 

HTSPDVDLGD ISGINASWN IQKEIDRLNE VAKNLNESLI DLQELGKYEQ YIKWPWYVWL 1200 

GFIAGLIAIV MVTILLCCMT SCCSCLKGAC SCGSCCKFDE DDSEPVLKGV KLHYT 1255 



<212> Type : PRT 
<211> Length : 1255

SequenceName : SEQ ID 269
SequenceDescription : 

Sequence 

<213> OrganismName : SARS coronavirus Urbani 
<400> PreSequenceString : 

MFIFLLFLTL TSGSDLDRCT TFDDVQAPNY TQHTSSMRGV YYPDEIFRSD TLYLTQDLFL 60 

PFYSNVTGFH TINHTFGNPV IPFKDGIYFA ATEKSNWRG WVFGSTMNNK SQSVIIINNS 120 

50 TNWIRACNF ELCDNPFFAV SKPMGTQTHT MIFDNAFNCT FEYISDAFSL DVSEKSGNFK 180 

HLREFVFKNK DGFLYVYKGY QPIDWRDLP SGFNTLKPIF KLPLGINITN FRAILTAFSP 240 

AQDIWGTSAA AYFVGYLKPT TFMLKYDENG TITDAVDCSQ NPLAELKCSV KSFEIDKGIY 3 00 

QTSNFRWPS GDWRFPNIT NLCPFGEVFN ATKFPSVYAW ERKKISNCVA DYSVLYNSTF 3 60 

FSTFKCYGVS ATKLNDLCFS NVYADSFWK GDDVRQIAPG QTGVIADYNY KLPDDFMGCV 420 

55 LAWNTRNIDA TSTGNYNYKY RYLRHGKLRP FERDISNVPF SPDGKPCTPP ALNCYWPLND 480 

YGFYTTTGIG YQPYRVWLS FELLNAPATV CGPKLSTDLI KNQCVOTNFN GLTGTGVLTP 540 

SSKRFQPFQQ FGRDVSDFTD SVRDPKTSEI LDISPCSFGG VSVITPGTNA SSEVAVLYQD 600 

VNCTDVSTAI HADQLTPAWR lYSTGNNVFQ TQAGCLIGAE HVDTSYECDI PIGAGICASY 660 

HTVSLLRSTS QKSIVAYTMS LGADSSIAYS NNTIAIPTNF SISITTEVMP VSMAKTSVDC 720 

60 NMYICGDSTE CANLLLQYGS FCTQLNRALS GIAAEQDRNT REVFAQVKQM YKTPTLKYFG 780 

GFNFSQILPD PLKPTKRSFI EDLLFNKVTL ADAGFMKQYG ECLGDINARD LICAQKFNGL 840 

TVLPPLLTDD MIAAYTAALV SGTATAGWTF GAGAALQIPF AMQMAYRFNG IGVTQNVLYE 900 

NQKQIANQFN KAISQIQESL TTTSTALGKL QDWNQNAQA LNTLVKQLSS NFGAISSVLN 960 

DILSRLDKVE AEVQIDRLIT GRLQSLQTYV TQQLIRAAEI RASANLAATK MSECVLGQSK 1020 

65 RVDFCGKGYH LMSFPQAAPH GWFLHVTYV PSQERNFTTA PAICHEGKAY FPREGVFVFN 1080 

GTSWFITQRN FFSPQIITTD NTFVSGNCDV VIGIINNTVY DPLQPELDSF KEELDKYFKN 1140 

HTSPDVDLGD ISGINASWN IQKEIDRLNE VAKNLNESLI DLQELGKYEQ YIKWPWYVWL 1200 
GPIAGLIAIV MVTILLCCMT SCCSCLKGAC SCGSCCKFDE DDSEPVLKGV KLHYT 1255

<212> Type : PRT 
<211> Length : 1255 
SequenceName : SEQ ID 270

SequenceDescription : 

Sequence 



<213> OrganismName : SARS coronavirus
<400> PreSequenceString : 

MFIFIiLPLTL TSGSDLDRCT TFDDVQAPNY TQHTSSMRGV YYPDEIFRSD TLYLTQDLFL 60 

PFYSNVTGFH TINHTFGNPV IPFKDGIYFA ATEKSNWRG WVFGSTMNNK SQSVIIINlSrS 12 0 

TNWIRAGNF ELCDNPFFAV SKPMGTQTHT MIFDNAFNCT FEYISDAPSL DVSEKSGMFK 180

15 HLREFVFKNK DGFLYVYKGY QPIDWRDLiP SGFNTLKPIF KLPLGINITN FRAIIiTAFSP 240 

AQDIWGTSAA AYFVGYLKPT TFMLKYDENG TITDAVDCSQ NPLAELKCSV KSFEIDKGIY 3 00 

QTSNFRWPS GDWRFPNIT NLCPFGEVFN ATKFPSVYAW ERKKISNCVA DYSVLYNSTF 360 

FSTFKCYGVS ATKLNDLCFS NVYADSFWK GDDVRQIAPG QTGVIADYNY KLPDDFMGCV 420 

LAWNTRNIDA TSTGNYNYKY RYLRHGKLRP FERDISNVPF SPDGKPCTPP ALNCYWPLND 480 

20 YGFYTTTGIG YQPYRVWLS FELLNAPATV CGPKLSTDLI KNQCVNFNFN GLTGTGVLTP 540 

SSKRFQPFQQ FGRDVSDFTD SVRDPKTSEI LDISPCAFGG VSVITPGTNA SSEVAVLYQD 600 

VNCTDVSTAI HADQLTPAWR lYSTGNNVFQ TQAGCLIGAE HVDTSYECDI PIGAGICASY 660 

HTVSLLRSTS QKSIVAYTMS LGADSSIAYS NNTIAIPTNF SISITTEVMP VSMAKTSVDC 720 

NMYICGDSTE CANLIiLQYGS FCTQLNRALS GIAAEQDRNT REVFAQVKQM YKTPTLKYFG 780 

25 GFNFSQILPD PLKPTKRSFI EDLLFNKVTL ADAGFMKQYG ECLGDINARD LICAQKFNGL 84 0 

TVLPPLLTDD MIAAYTAALV SGTATAGWTF GAGAALQIPF AMQMAYRFNG IGVTQNVLYE 90 0 

NQKQIANQFN KAISQIQESL TTTSTALGKL QDWNQNAQA LNTLVKQLSS NPGAISSVLN 960

DILSRLDKVE AEVQIDRLIT GRLQSLQTYV TQQLIRAAEI RASANLAATK MSECVLGQSK 102 0 

RVDFCGKGYH LMSFPQAAPH GWFLHVTYV PSQERNFTTA PAICHEGKAY FPREGVFVFN 1080 

30 GTSWFITQRN FFSPQIITTD NTFVSGNCDV VIGIINNTVY DPLQPELDSF KEELDKYFKN 1140 

HTSPDVDLGD ISGIWASYVN IQKEIDRLNE VAKNLNESLI DLQELGKYEQ YIKWPWYVWL 1200 

GPIAGLIAIV MVTILLCCMT SCCSCLKGAC SCGSCCKFDE DDSEPVLKGV KLHYT 1255 

<212> Type : PRT
<211> Length : 1255

SequenceName : SEQ ID 271 
SequenceDescription : 

Sequence 

<213> OrganismName : SARS coronavirus Tor2
<400> PreSequenceString : 

MFIFLLFLTL TSGSDLDRCT TFDDVQAPNY TQHTSSMRGV YYPDEIFRSD TLYLTQDLFL 60 

PFYSNVTGFH TINHTFGNPV IPFKDGIYFA ATEKSNWRG WVFGSTMNNK SQSVIIINNS 12 0 

45 TNWIRACNF ELCDNPFFAV SKPMGTQTHT MIFDNAFNCT FEYISDAPSL DVSEKSGMFK 180 

HLREFVFKNK DGFLYVYKGY QPIDWRDLP SGFNTLKPIF KLPLGINITN FRAILTAFSP 240 

AQDIWGTSAA AYFVGYLKPT TFMLKYDENG TITDAVDCSQ NPLAELKCSV KSFEIDKGIY 30 0 

QTSNFRWPS GDWRFPNIT NLCPFGEVFN ATKFPSVYAW ERKKISNCVA DYSVLYNSTP 360 

FSTFKCYGVS ATKLNDLCFS NVYADSFWK GDDVRQIAPG QTGVIADYNY KLPDDFMGCV 42 0 

50 LAWNTRNIDA TSTGNYNYKY RYLRHGKLRP FERDISNVPF SPDGKPCTPP ALNCYTaFPLND 480 

YGFYTTTGIG YQPYRVWLS FELLNAPATV CGPKLSTDLI KNQCVNFNFN GLTGTGVLTP 540 

SSKRFQPFQQ FGRDVSDFTD SVRDPKTSEI LDISPCAFGG VSVITPGTNA SSEVAVLYQD 600 

VNCTDVSTAI HADQLTPAWR lYSTGNNVFQ TQAGCLIGAE HVDTSYECDI PIGAGICASY 660 

HTVSLLRSTS QKSIVAYTMS LGADSSIAYS NNTIAIPTNF SISITTEVMP VSMAKTSVDC 72 0 

55 NMYICGDSTE CANLLLQYGS FCTQLNRALS GIAAEQDRNT REVFAQVKQM YKTPTLKYFG 780 
GFNFSQILPD PLKPTKRSFI EDLLFNKVTL ADAGFMKQYG ECLGDINARD LICAQKFNGL 840

TVLPPLLTDD MIAAYTAALV SGTATAGWTF GAGAALQIPF AMQMAYRFNG IGVTQNVLYE 900 

NQKQIANQFN KAISQIQESL TTTSTALGKL QDWNQNAQA LNTLVKQLSS NFGAISSVLN 960 

DILSRLDKVE AEVQIDRLIT GRLQSLQTYV TQQLIRAAEI RASANLAATK MSECVLGQSK 1020 

60 RVDFCGKGYH LMSFPQAAPH GWFLHVTYV PSQERNFTTA PAICHEGKAY FPREGVFVFN 108 0 

GTSWFITQRN FFSPQIITTD NTFVSGNCDV VIGIINNTVY DPLQPELDSF KEELDKYFKN 1140 

HTSPDVDLGD ISGINASWN IQKEIDRLNE VAKNLNESLI DLQELGKYEQ YIKWPWYVWL 1200 

GPIAGLIAIV MVTILLCCMT SCCSCLKGAC SCGSCCKFDE DDSEPVLKGV KLHYT 1255 

<212> Type : PRT

<211> Length : 1255 

SequenceName : SEQ ID 272 
SequenceDescription :
Sequence 

<213> OrganismName : SARS coronavirus GD01
<400> PreSequenceString : 

MFIFLIiFLTL TSGSDLDRCT TFDDVQAPNY TQHTSSMRGV YYPDEIFRSD TLYLTQDLFL 60 

PFYSNVTGFH TINHTFDNPV IPFKDGIYFA ATEKSNWRG WVFGSTiNlNlsrK SQSVIIINNS 12 0 

TNWIRACMF ELCDNPFFAV SKPMGTQTHT MIFDNAFNCT FEYISDAFSL DVSEKSGNFK 180 

10 HLREFVFKNK DGFLYVYKGY QPIDWRDLP SGFNTLKPIF KLPLGINITN FRAXLTAFLP 240 

AQDTWGTSAA AYFVGYLKPT TFMLKYDENG TITDAVDCSQ NPLAELKCSV KSFEIDKGIY 3 00 

QTSNFRWPS RDWRFPiSriT NLCPFGEVFM ATKFPSVYAW ERKRISNCVA DYSVLYNSTF 3 60 

FSTFKCYGVS ATKLNDLCFS NVYADSFWK GDDVRQIAPG QTGVIADYNY KIiPDDFMGCV 420 

LAWNTRNIDA TSTGNYNYKY RYLRHGKLRP FERDISNVPF SPDGKPCTPP ALNCYWPLND 480

15 YGFYTTTGIG YQPYRVWLS YELLNAPATV CGPKLSTDLI KNQCVNFNFN GLTGTGVLTP 540 

SSKRFQPFQQ FGRDVSDFTD SVRDPKTSEI LDISPCSFGG VSVITPGTNA SSEVAVLYQD 60 0 

VNCTDVSTAI HADQLTPAWR lYSTGNNVFQ TQAGCLIGAE HVDTSYECDI PIGAGICASY 660 

HTVSLLRSTS QKSIVAYTMS LGADSSIAYS NNTIAIPTNF SISITTEVMP VSMAKTSVDC 720 

NMYICGDSTE CANLLLQYGS FCTQLNRALS GIAAEQDRNT REVFAQVKQM YKTPTLKDFG 780 

20 GFNFSQILPD PLKSTKRSFX EDLLFNKYTL ADAGFMKQYG ECLGDINARD LICAQKFNGIi 840 

TVLPPLLTDD MIAAYTAALV SGTATAGWTF GAGAALQIPF AMQMAYRFNG IGVTQNVLYE 900 

NQKQIANQFN KAISQIQESL TTTSTALGKL QDWNQNAQA LNTLVKQL.SS NFGAISSVLN 960 

DILSRLDKVE AEVQIDRLIT GRLQSLQTYV TQQLIRAAEI RASANLAATK MSECVLGQSK 1020 

RVDFCGKGYH LMSFPQAAPH GWFLHVTYV PSQERNFTTA PAICHEGKAY FPREGVFVFN 10 80 

25 GTSWFITQRN FFSPQIITTD NTFVSGNCDV VIGIINNTVY DPLQPELDSF KEELDKYFIOSf 1140 

HTSPDVDIiGD ISGINASWN IQKEIDRLNE VAKNIiNESLI DLQEIiGKYEQ YIKWPWYVWL 1200 

6FIAGLIAIV MVTIIiLCCMT SCCSCLKGAC SCGSCCKFDE DDSEPVLKGV KLHYT 1255 

<212> Type : PRT 
<211> Length : 1255

SequenceName : SEQ ID 273 
SequenceDescription : 

Sequence 

<213> OrganismName : SARS coronavirus CUHK-W1
<400> PreSequenceString : 

MFIFLLFIiTL TSGSDLDRCT TFDDVQAPNY TQHTSSMRGV YYPDEIFRSD TLYLTQDLFL 60 

PFYSNVTGFH TINHTFDNPV IPFKDGIYFA ATEKSNWRG WVFGSTMNNK SQSVIIINNS 12 0 

40 TNWIRACNF ELCDNPFFAV SKPMGTQTHT MIFDNAFNCT FEYISDAFSL DVSEKSGNFK 18 0 

HLREFVFKNK DGFLYVYKGY QPIDWRDLP SGFNTLKPIF KLPLGINITN FRAILTAFSP 24 0 

AQDTWGTSAA AYFVGYLKPT TFMLKYDENG TITDAVDCSQ NPLAELKCSV KSFEIDKGIY 3 00 

QTSNFRWPS GDWRFPNIT NLCPFGEVFN ATKFPSVYAW ERKKISNCVA DYSVLYNSTF 360 

FSTFKCYGVS ATKLNDLCFS NVYADSFWK GDDVRQIAPG QTGVIADYNY KLPDDFMGCV 420 

45 LAWNTRNIDA TSTGNYNYKY RYLRHGKLRP FERDISNVPF SPDGKPCTPP ALNCYWPLND 48 0 

YGFYTTTGIG YQPYRVWLS FELLNAPATV CGPKLSTDLI KNQCVNFNFN GLTGTGVLTP 54 0 

SSKRFQPFQQ FGRDVSDFTD SVRDPKTSEI LDISPCSFGG VSVITPGTNA SSEVAVLYQD 600 

VNCTDVSTAI HADQLTPAWR lYSTGNNVFQ TQAGCLIGAE HVDTSYECDI PIGAGICASY 660 

HTVSLLRSTS QKSIVAYTMS LGADSSIAYS NNTIAIPTNF SISITTEVMP VSMAKTSVDC 72 0 

50 NMYICGDSTE CANLLLQYGS FCTQLNRALS GIAAEQDRNT REVFAQVKQM YKTPTLKYFG 780 

GFNFSQILPD PLKPTKRSFI EDLLFNKVTL ADAGFMKQYG ECLGDINARD LICAQKFNGL 840 

TVLPPLLTDD MIAAYTAALV SGTATAGWTF GAGAALQIPF AMQMAYRFNG IGVTQNVLYE 900 

NQKQIANQFN KAISQIQESL TTTSTALGKL QDWNQNAQA LNTLVKQLSS NFGAISSVLN 960 

DILSRLDKVE AEVQIDRLIT GRLQSLQTYV TQQLIRAAEI RASANLAATK MSECVLGQSK 1020 

55 RVDFCGKGYH LMSFPQAAPH GWFLHVTYV PSQERNFTTA PAICHEGKAY FPREGVFVFN 108 0 

GTSWFITQRN FFSPQIITTD NTFVSGNCDV VIGIINNTVY DPLQPELDSF KEELDKYFKN 1140 

HTSPDVDLGD ISGINASWN IQKEIDRLNE VAKNLNESLI DLQELGKYEQ YIKWPWYVWL 1200 

GFIAGLIAIV MVTILLCCMT SCCSCLKGAC SCGSCCKFDE DDSEPVLKGV KLHYT 1255 

<212> Type : PRT

<211> Length : 1255 

SequenceName : SEQ ID 274 
SequenceDescription : 

Sequence



<213> OrganismName : SARS coronavirus BJ01
<400> PreSequenceString :
MPIPLLPIiTL TSGSDLDRCT TFDDVQAPNY 
PFYSNVTGFH TINHTFDNPV IPFKDGIYFA 
TNWIRACNF ELCDNPFFAV SKPMGTQTHT 
5 HIiREFVPKNK DGFLYVYKGY QPIDWRDLP 
AQDTWGTSAA AYPVGYLKPT TFMLKYDENG 
QTSNFRWPS GDWRFPNIT NLCPFGEVFN 
FSTFKCYGVS ATKLNDLCFS NVYADSFWK 
LAWNTRNIDA TSTGNYNYKY RYLRHGKLRP 

10 YGFYTTTGIG YQPYRVWLS FELLNAPATV 
SSKRFQPFQQ PGRDVSDFTD SVRDPKTSEI 
VNCTDVSTAI HADQLTPAWR lYSTGNNVFQ 
HTVSLLRSTS QKSIVAYTMS L6ADSSIAYS 
NMYICGDSTE CANLLLQYGS FCTQLNRALS

15 GFNFSQILPD PLKPTKRSFI EDLLFNKVTL 
TVLPPLLTDD MIAAYTAALV SGTATAGWTF 
NQKQIANQFN KAISQIQESL TTTSTALGKL 
DILSRLDKVE AEVQIDRLIT GRLQSLQTYV 
RVDFCGKGYH LMSFPQAAPH GWFLHVTYV 

20 GTSWFITQRN FFSPQIITTD NTFVSGNCDV 
HTSPDVDLGD ISGINASWN IQKEIDRLNE 
GFIAGLIAIV MVTILLCCMT SCCSCLKGAC 



TQHTSSMRGV YYPDEIFRSD TLYLTQDLFL 60 

ATEKSNWRG WVFGSTMNNK SQSVIIINNS 120 

MIFDNAFNCT FEYISDAFSL DVSEKSGNFK 180 

SGFNTLKPIF KLPLGINITN FRAILTAFSP 240 

TITDAVDCSQ NPLAELKCSV KSFEIDKGIY 3 00 

ATKFPSVYAW ERKKISNCVA DYSVLYNSTF 3 60 

GDDVRQIAPG QTGVIADYNY KLPDDFMGCV 420 

FERDISNVPF SPDGKPCTPP ALNCYWPLND 480 

CGPKLSTDLI KNQCVNFNFN GLTGTGVLTP 540 

IiDISPCSFGG VSVITPGTNA SSEVAVLYQD 600 

TQAGCIilGAE HVDTSYECDI PIGAGICASY 660 

NNTIAIPTNF SISITTEVMP VSMAKTSVDC 720 

GIAAEQDRNT . REVFAQVKQM YKT5>TLKYFG 7 80 

ADAGFMKQYG ECIiGDINARD LICAQKFNGL 840 

GAGAALQIPF AMQMAYRFNG IGVTQNVLYE 9 00 

QDWNQNAQA LNTLVKQLSS NFGAISSVLN 960 

TQQLIRAAEI RASANIiAATK MSECVLGQSK 102 0 

PSQERNFTTA PAICHEGKAY FPREGVFVFN 1080 

VIGIINNTVY DPLQPELDSF KEELDKYFKN 1140 

VAKNLNESLI DLQELGKYEQ YIKWPWYVWL 1200 

SCGSCCKFDE DDSEPVLKGV KLHYT 1255 



<212> Type : PRT 
<211> Length : 1255

SequenceName : SEQ ID 275 
SequenceDescription : 



Sequence 


<213> OrganismName : SARS coronavirus
<400> PreSequenceString : 

SGFRKMAFPS GKVEGCMVQV TCGTTTIjNGIi WLDDTVYCPR HVICTAEDML NPISTYEDLLIR 60 
KSKHSFLVQA GNVQLRVIGH SMQNCLLRLK VDTSNPKTPK YKFVRIQPGQ TPSVLACYCTG 120 

35 SPSGVYQCAM RPNHTIKGSF LNGSCGSVGF NIDYDCVSFC YMHHMELPTG VHAGTDLEGK 180 
FYGPFVDRQT AQAAGTDTTI TLNVLAWLYA AVINGDRWFIi NRFTTTLNDF NLVAMKYNYE 240 
PItTQDHVDIIi GPLSAQTGIA VUDMCAAIiKE LLONOOTGRT ILGSTILEDE FTPFDfWRQC 300 
SGVTFQ 3 06 

<212> Type : PRT

<211> Length : 306

SequenceName : SEQ ID 276 
SequenceDescription : 



Sequence 

<213> OrganismName : SARS coronavirus 
<400> PreSequenceString : 

AIASEFSSLP SYAAYATAQE AYEQAVANGD SEWLKKLKK SLNVAKSEFD RDAAMQRKLE 60 

KMADQAMTQM YKQARSEDKR AKVTSAMQTM LFTMLRKLDN DALNNIINNA RDGCVPLNII 120 
50 PLTTAAKLMV WPDYGTYKN TCDGNTFTYA SALWEIQQW DADSKIVQLS BINMDNSPNL 180 

AWPLIVTALR ANSAVKLQ 198 

<212> Type : PRT 

<211> Length : 198 

SequenceName : SEQ ID 277 
SequenceDescription :



Sequence 



<213> OrganismName : SARS coronavirus 
<400> PreSequenceString :

AGNATEVPAN STVLSFCAFA VDPAKAYKDY LASGGQPITN CVKMLCTHTG TGQAITVTPE 60 
ANMDQESFGG ASCCLYCRCH IDHPNPKGFC DLKGKYVQIP TTCANDPVGF TLRNTVCTVC 120 
GMWKGYGCSC DQLREPLMQ 139 
<212> Type : PRT 
<211> Length : 139

SequenceName : SEQ ID 278 
SequenceDescription : 




Sequence 



<213> OrganismName : SARS coronavirus 
<400> PreSequenceString :

NNELSPVALR QMSCAAGTTQ TACTDDNALA YYNNSKGGRP VLALLSDHQD LKWARFPKSD 60 
GTGTIYTEIiE PPCRFVTDTP KGPKVKYLYF IKGLNNLNRG MVXiGSLAATV RLQ 113 

<212> Type : PRT 
<211> Length : 113

SequenceName : SEQ ID 279
SequenceDescription : 



Sequence



<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString : 

MNKIFKVIWN PATGNYTVTS ETAKSRGKKS GRSKLLISAL VAGGMIiSSFG ALANAGNDNG 60 

20 QGVDYGSGSA GDGWVAIGKG AKANTFMNTS GSSTAVGYDA lAEGQYSSAI GSKTHAIGGA 120 
SMAFGVSAIS EGDRSIALGA SSYSLGQYSM ALGRYSKALG KLSIAMGDSS KAEGANAIAL 18 0 

GNATKATEIM SIALGDTANA SKAYSMALGA SSVASEENAI AIGAETEAAE NATAIGNNAK 240 
AKGTNSMAMG FGSLADKVNT lALGNGSQAL ADNAIAIGQG NKADGVDAIA LGNGSQSRGL 3 00 

NTIALGTASN ATGDKSLALG SNSSAKGINS VALGADSIAD LDNTVSVGNS SLKRKIVNVK 3 60 

25 NGAIKSDSYD AINGSQIiYAI SDSVAKRLGG GAAVDVDDGT VTAPTYNLKN GSKNNVGAAL 420 
AVLDENTLQW DQTKGKYSAA HGTSSPTASV ITDVADGTIS ASSKDAVNGS QLKATNDDVE 480 
AMTANIATNT SNIATNTANI ATNTTNITNL TDSVGDLQAD ADIiWNETKKA FSAAHGQDTT 540 
SKITNVKDAD LTADSTDAVN GSQLKTTNDA VATNTTNIAN NTSNIATNTT NISNLTETVT 600 
NLGEDALKWD KDNGVFTAAH GTETTSKITN VKDGDLTTGS TDAVNGSQLK TTHDAVATNT 660 

30 TNIATNTTNI SNLTETVTNIi GEDALKWDKD NGVFTAAHGN NTASKITNIL DGTVTATSSD 720 
AINGSQLYDL SSNIATYFGG NASVNTDGVF TGPTYKIGET NYYNVGDADA AINSSFSTSL 7 80 

GDALLWDATA GiCFSAKHGTU GDASVITDVA DGEISDSSSD AVNGSQLHGV SS YWDALGG 840 
GAEVNADGTI TAPTYTIAKTA DYDNVGDALN AIDTTLDDAL LWDADAGENG AFSAAHGKDK 9 00 

TASVITNVAN GAISAASSDA XNGSQIiYTTN KYIADALGGD AEVNADGTIT APTYTIANAE 960 

35 YWNVGDALDA IiDDNALIiWDE TANGGAGAYN ASHDGKASII TNVANGSISE DSTDAWGSQ 1020 

LNATNMMIEQ NTQXXNQLAG NTDATYIQEN 6AGINYVRTN DDGLAFNDAS AQGVGATAIG 10 80 

YNSVAKGDSS VAIGQGSYSD VDTGIALGSS SVSSRVIAKG SRDTSITENG WIGYDTTDG 1140 

ELLGALSIGD DGKYRQXINV ADGSEAHDAV TVRQLQNAIG AVATTPTKYF HANSTEEDSIi 1200 

AVGTDSIAMG AKTIWGDKG XGIGYGAYVD ANALNGIAIG SNAQVXHVNS lAIGNGSTTT 1260 

40 RGAQTNYTAY NMDAPQNSVG EFSVGSADGQ RQITNVAAGS ADTDAVNVGQ LKVTDAQVSQ 13 20 

NTQSXTNLDIT RVTNLDSRVT NIENGXGDXV TTGSTKYFKT NTDGVDASAQ GKDSVAIGSG 13 80 

SIAAADNSVA LGTGSVATEE NTXSVGSSTN QRRXTNVAAG KNATDAVNVA QLKSSEAGGV 1440 

RYDTKADGSI DYSNITLGGG NGGTTRISNV SAGVNNWDW NYAQLKQSVQ ETKQYTDQRM 1500 

VEMDNKLSKT ESKLSGGIAS AMAMTGLPQA YTPGASMASX GGGTYNGESA VALGVSMVSA 1560 

45 NGRWVYKLQG STNSQGEYSA ALGAGIQW 1588 



<212> Type : PRT 

<211> Length : 1588 

SequenceName : SEQ ID 280 
SequenceDescription :

Sequence 



<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString : 

55 MPASAVGALG EASYTVTANV TDSAGNSNSA SHNVQVNTAL PGVTINPVAT DDIINAAESG 60 

NAQTISGQVT GAAAGDTVTV TLGGKTYTAT VQGNLSWSVD VPAADIQAIG NGNLTWASV 12 0 

TNGVGNTGSG SRDITIDANL PGLRVDTVAG DOWNS lEHA QALVITGSSS GLAAGAALTV 180 

VXNTVTYAAT VLADGTWSVG VPAADVSNWP AGTVNITVSG TNTAGTTSTX THPVTVDLAA 240 

VAISINTVSG DDVINAAEKG ADLTLSGSTS GVEVGQTVTV TFGGKTYTAT VAGDGSWTTT 3 00 

60 VPAADLSVLR DGDATVQASV STINGNTASA THAYSVDATA PTLAXNTIAT DDILNAAEAG 360 

NPLTISGSST AEAGQTVTVT LNGVTYSGSV QADGSWSVSL PTADLSNLTA SQYTVSASVS 420 

DKA6NPASAN H6DAVDLTVP VLTINTVSGD DIINAAEHGQ ALVISGSSTG GEAGDVITVT 480 

LNSKTYTTML DASGNWSVGV PAADVTALGS GPQTITAAIT DAAGNSDDAS RTVTVNLAAP 540 

TIGINTIATD DVIKATEKGA DLQITGTSNQ PAGTTITVTL NGQNYTATTD SNGNWSATVP 60 0 

65 ASAVSALGEA NYTVTANVTD TAGNSNSASH NVLVNSALPA VTIKfAVATDD IINAAESGNA 660 

QTISGQVTQA AQGDTVTVTL GGNTYTATVQ SNLSWSVDVP AADIQALGNG DLTVNASVTN 720 

GVGNTGSGSR DITIDANLPG LRVDTVAGDD VINSXEHNQA LVITGSSSGL TAGTALTVEX 780 
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NWVTYGATVL ADGTWSIiGVP AVDVSNWPAG 
ITINTLSGDD VINAVEKGET LWSGSTSGV 
PADLAALPDG AGNVQASVSN INGNSAQADR 
ITISGTTTAQ AGQTLTVTLN NNTYQTTVLA 
5 AGNPASADHA LWDITAPDL TINTVAGDDI 
GKNYTTTLDA SGNWSVGIPA ADVTALATGS 
TINTVSGDDI INAAEIWAQ TISGQVTGTA 
ANVIiQALGNG ELTISASLTN SA3SINT6TATH 
LVITGSSSGL AAGAALTWI NSVTYGATVIi 

10 TAGTTTSISH PVTVDLAAVA ITINTLSTDD 
GGKSYTTTVA ADNTWGLTIP AVDVATLPDG 
VTINTIATDD ILNAAEAGSA LTIS6TSTAE 
GDLASLTASS YTVNASVSDK ARNSASATHN 
IISGSATGAT TGNTVSVTIG TTTYTTVLDA 

15 AGNSGTASHT VTVALGAPVIi AINTIAVDDI 
QNYTTTADAS GNWSVTVPAS RVSALGEATY 
INWATDDII NAAEAGVEQT ISGQVTGAAA 
AIiQELGMGEL TISASVTNSV GNTGNGTREI 
ITGSSSGLAA GSNVTLTING QTYVAAVLAD 

20 GNPVSVTHPV TVDLSAVAVS INAITADDVI 
KTYSATVAAN GSWSTSVPAA DMAALRDGDA 
INTIAGDDIL NAAEAGAAIiT ITGSSTAEAG 
LSTLTASNYT VNAAVSDKAG NPASWHNLT 
SGSATGAATG STVTVTIGTN TFTTVLDASG 

25 NSGSATHQVT VNTGLPTITF NAISGDNILIST 
NYSATTDASG NWTIiTVPVSD LAALGQANYT 
NTVAGDDIIN AAEAGADQTI SGWTRAAAG 
LQALGNGDLT ITASVTNANG NTGSGTRDIT 
TGGSSGLNAG AVLTVTINSV AYSATVQADG 

30 MPVSVSHPFT VDLTAVAISI NTVASDDVIN 
TYTASVAANG SWSVNVPAAD LiATIiPEGAAN 
NTIASDDILN AAEAGSPLTX SGTSTAETGQ 
GALMASNYTV SATVNDKAGN PGSASHWLAV 
GTSSGGEAGD WSWLNGKT YTTTLDASGN 

35 SDDASRTVTV SLSAPVISIN" TIAGDDVINA 
SATTDASGIiilW SVTVPASAVS ALGEATYSVT 
VATDDIINAS EAGSAQTISG QVTGAAAGST 
ALGNGELTVN ASVTNAVGNT GSGTRDITID 
SSSGFAAGTA LTWINNQTY AATVLANGSW 

40 TSITHPLTVD LTAVAISMNS ITSDDAINAA 
TTTVAANGSW STTVPAADLA ALRDGDASAQ 
lASDNIINAS EAAAGVTVSG TSTAQTGQTL 
LANNGYTLTA TVSDLAGMLG SASKGVTVDT 
ATGAVAGDRL WTIAGQQYV TSTDASGNWS 

45 TQTHNVQVNT AAVSLSVSTI SGDNLINAAE 
ATIQSNGSWS VNVPAADVAA LSDGTSYTVS 
STISGDNLIN AAEAGSALTL SGTGTNFATG 
VAALSDGTSY TVSASAQDSA GNSATASRSV 
LNGSTSAEVG QTVTVTFGGK TYTATVAANG 

50 NPGQATHALT VDTVAPTVTI ATVAGDDIIN 
WSATVGSGGS WSVFIPAQQF AGLSDGSYTI 
TFAGDDWKTA AEHGSSLVIS GTTTAPVGQT 
ALADGNAYVI NASVSNAIGN TGSSNHTITV 
PVWNGSLTA ALASNETAQI SIDGGTTWTT 

55 AGNVGATDSQ NWIDTTAPD PAVKTIAISA 
AGEFAQISIiD GGVTWTTLTV VGTSWSYADG 
VDTTSPEAAK SITITGISDD TGTSSSDFIT 
WVNVTVAADS LNWSYVDGRT LTNGTTTWQV 
ASISTDTGSS ATDFITSDTM LTLTGSLGAG 

60 DSRTLTDGSY VYQVRVLDLA GNTGPWSKT 
SQATDDTTPL IiNGVLSAPLA SGBWYLYRN 
ARWDIiAGNI TSSSDFVLTV DTSIPTTLAQ 
INGKTYTSEP GGAVWDPAH NTWYVQLPDT 
GTVTVNAAID YTPTWTTASK TTAWGLTYGL 

65 YQSGNNYATS SIADYDRNGT GDLFITRDDY 
GSIVAFDKEG DGYLDFWIGD AGGPDSNTFL 
SLNEGSGVDL NNDGRIDLVQ HTYNLNNYYT 



TVNITVSGTN SAGTTSTITH PVTVDLAGVA 840 
EAGQTVTVTF GGKNYTTTVE AHGSWTVNVP 900 
AYSVDATAPL VTINTIASDD ILNVSEAGAG 960 

DGTWSVNVPA ADLSGLTASS YTVTATVSDK 1020 

INAIEHGOAIi WSGTSTGAA AGDWTVTLN 1080 

QTITASIiSDR AGNSDSTTHD VTVDLSGPTL 1140 

VAGNTVIVTI GGNQYNATVQ SDLSWSVSVP 12 00 

DIVIDAISTLPG LRVDTVAGDD VINSIEHTQA 1260 

ADGSWSVGVP VADVTNWPAG TVNIAVSGTN 1320 

VINAAEKGSD LQLSGTTSGV EAGQTITVIF 13 80 

AANVQASVSN VAGNSTQATH AYSVDATAPS 1440 

AGQTVTVTLN GVNYSGNVQA DGSWSVSVPT 1500 

LTVDIiAAPW TINTVAGDDI INATEHGQAQ 1560 

NGNWSIGVPA SVISALAQGD VTITATVTDS 1620 

INAAEKGADL AITGTSNQPA GTQITVTLNG 1680 

TVTAAATDAD GNSGSASHNV QVNTALPGVT 1740 

GDTVTVTLGG ATYTATVQAN LSWSVDVPAS 1800 

TIDANIiPGLR VDTVAGDDW NIIEHGQALV 1860 

GTWSVGVPAV DVSAWPAGSV TIAASGSTSA 1920 

NAAEKGAALT LSGSTSGVEA GQTVTVTFGG 1980 

SAQASVSNVN GMSATTTHAY SVDASAPTVT 2040 

QTVTVTLNGT NYTGTVQTDG SWSVSVPSAD 2100 

VDTSVPWTI NTVAGDDVIN ATEHAQAQII 2160 

NWSVGVPASV VSAIiANGTVT INASVTDAGG 2220 

ADEKGQPLTI SGGSTGLATG AQVTVTLNGH 2280 

VSASATSAAG NTASSQANLL VDSGLPDVTI 2340 

DTVTVTLGGN TYTATVQSNL SWSVSVPTAD 24 00 

IDANLPGLRV DTVAGDDIVN SIEHGQALVI 24 60 

SWSVGIPAAN VSAWPAGPLT VEVDGQSSAN 2520 

AAEKGTNLTIi SGSTSGIESG QTVTVTFGGK 2580 

VQASVSSASG NSASATHAYS VDASAPTLTI 2640 

TVTVTLNGAT YTGTVQADGS WSVSVPTSAL 2700 

DTTAPVIiTIN TVAGDDIIND AEHAQALVIS 2760 

WSVGVPAADV TAIiGSGAQTI TASVSDRAGN 2820 

TEKGSDLAIiS GTSDQPAGTA ITVTLNGQNY 2880 

ASVTNAQGNS STASHNVQVN TALPGITINP 2940 

VTVELGGKTY TATVQADLSW NVSVPAADWQ 3 000 
ASLPGLRVDT VAGDDWNII EHAQAQVITG 3060

SVGVPATDVS NWPAGTLNIT VSGANSAGTQ 3120 

EKGAAIiTLSG STSGVEAGQT VTVTFGGKTY 3180 

VRVTNVNGNS ATATHEYSVD SAAPTVTINT 3240

TVTLNGTNYQ TTVQTDGSWS LTLPASDLTA 33 00 

TAPVISFNTV AGDDVINWVE HIQAQIISGT 33 60 

VGVPASVISG LADGTVTISA TITDSAGNSS 3420 

AGSALTLSGT GTNFATGTW TVLLNGKGYS 3480 

ASAQDSAGNG NSSTQTHNVQ VNTAAVSLSV 3 54 0 

TWTVLLNGK GYSATIQSNG SWSVNVPAAD 3 60 0 

AVDLTAPVIS INTVSTDDRL NAAEQQQPLT 3 660 

TWALNVPAVD LAALGQGAQT ITASVNDRAG 3 720 

NAEQLAGQTI SGTTTAEVGQ TVTVTFNGQT 3780 

SATVSDQAGN PGSASRGVTL NGDVPTVTIN 3 84 0 

LTIiTLMGKTY TTTVQTGGSW SYTLGSADVT 3900 

DLSAPAMGIN IDSLQADTGL SASDFITSVS 3960 

LTVTGTTWRY NDSRTLTDGN YLYQVRVIDA 402 0 

ITTDMGLITN DFVTSDTTLA VSGTLGATLS 4080 

HTLTDGTWNY TVRWDLAGlSr VGQTATQNW 4140 

SDTTLTVRGV LGAALGANEF AQISTDNGAT 4200 

RWDLAGNVG ATSSQSALID TVNPAQVLTI 4260 

LASGEVAQIS LDSGATWTTL TTNGTQWTYT 4320 

VWDTINPTA TPTIVSYTDD VGQRQGTLSS 4380 

GLLLGAVTMV 6ALNWTYSDS GLVSGAYTYS 4440 

ITSQTTRDTT PIISGVITAA LASGQYVBW 4500 

DALTVSATAY TVTAQVKSSA GNGNNANISN 4560 

DSHGMWTVLA NQQVMQSTDP LTWSKTALTL 4620 

GTGYINGFTN NGDGTFSSAI QVTVGTLTWY 4680 

WNNAGTLVGN STTSNSGGSA TVGGAVTGYL 4740 

IiSSIiINQGNG TFVWGQNTTN TFLSGAGSGA 4800 
MSSSVSMTWA DFDGDGDMDL FLPASQGRAN YGSLLFNTNG VLGGPVAVGA TATTYASQFS 4860

LAVDWNHDGIi MDIARIAQTG QSYIiYTNVSN ASNWTQSALG GSQSGTTSGV AAMDYDWDGA 4920 

VDVLVSKQSG SVFLSRNTNT VSYGTSLHLR ITDPNGINVY YGNTVKLYNS AGVLVATQII 4980 

NPQSGMGVND TSAIiVNFYGL NAGETYNAVL IKSTGTTASN IDQTVNTSWG GLQATDATHA 5040 

5 YDLSAEAGTA SNNGKFVGTG YNDTFFATAG TDTYDGSGGW VYSSGTGTWL ANGGMDWDF 5100 

RLSTVGVTAN LSSTAAQATG FNTSTFTNIE GISGSNFNDI LTGSSGDNQL EGRGGNDTLN 5160 

IGNGGHDTLL YKLLNASDAT GGNGSDWNG FTVGTWEGTA DTDRIDIREL LQGSGYTGNG 5220 

KASYVNGVAT LDAQAGNIGD FVKVTQSGSD TIVQIDRDGT GGTFATTNW TLTGVHTDLA 5280 

TLLANHQLMV V 5291 
<212> Type : PRT

<211> Length : 5291 

SequenceName : SEQ ID 281 

SequenceDescription : 

Sequence



<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString :

MGVHTAEATL PNGNNDTKIV NIAPDASNAQ VTLNIPAQQV VTNNSDSVQL TATVKDPSNH 60 

20 PVAGITVNFT MPQDVAANFT LENNGIAITQ ANGEAHVTLK GKKAGTHTVT ATLSNISTNTSD 12 0 

SQPVTFVADK TSALWLQIS KNEITGNGVD SATLTATVKD QFDNEVNNLP VTFSTASSGL 18 0 

TLTPGESNTN ESGIAQATIiA GVAFGEQTVT ASLANNGASD NKTVHFIGDT AAAKIIELTP 240 

VPDSIIAGTP QNSSGSVITA TWDNITGFPV KGVTVNFTSN AATAEMTNGG QAVTNEQGKA 3 00 

TVTYTNTRSS lESGARPDTV EASBENGSST LSTSINVNAD ASTAHLTLLQ ALFDTVSAGD 360 

25 TTNLYIEVKD NYGNGVPQQE VTLSVSPSEG VTPSNNAIYT TNHDGNFY-AS FTATKAGVYQ 42 0 

VTATLENGDS MQQTVTYVPN VANAEISLAA SKDPVIANNN DLTTLTATVA DTEGNAIANS 48 0 

EVTFTLPEDV RANFTLGDGG KWTDTEGKA KVTLKGTKAG AHTVTASMAG GKSEQLiWNF 540 

lADTLTAQVN LNVTEDNFIA NNVGMTRLQA TVTDGNGNPL ANEAVTFTLP ADVSASFTLG 60 0 

QGGSAITDIN GKAEVTLSGT KSGTYPVTVS VNNYGVSDTK QVTLIADAGT AKLASLTSVY 660 

30 SFWSTTEGA TMTASVTDAN GNPVEGIKVN FRGTSVTLSS TSVETDDRGF AEILVTSTEV 720 

GLKTVSASLA DKPTEVISRL LNAKADINSA TITSLEIPEG QVMVAQDVAV KAHVNDQFGKf 78 0 

PILNESVTFS AEPPEHMTIS QNIVSTDTHG lAEVTMTPER NGSYMVKASL ANGSSYEKDL 840 

WIDQBCLTLS ASSPLIGWS PTGATLTATL TSANGTPVEG QVINFSVTPE GATIaSGGKVR 9 00 

TNSSGQAPW liTSNKVGTYT VTASFHNGVT IQTQTIVKVT GNSSTAHVAS FIADPSTIAA 960 

35 TNSDLSTLKA TVEDGSGNLI EGLTVYFALK SGSATLTSLT AVTDQNGIAT TSVRGAITGS 1020 

VTVSAVTTAG GMQTVDITL,V AGPADASQSV LKNNRSSLKG DFTDSAELHL VLHDXSGNPI 10 8 0 

ICVSEGLEFVQ SGTNAPYVQV SAIDYSKNFS GEYKATVTGG GEGIATLIPV LNGVHQAGLS 1140 

TTIQFTRAED KIMSGTVLVN GANLPTTTFP SQGFTGAYYQ LNNDNFAPGK TAADYEFSSS 12 00 

ASWVDVDATG KVTFKNVGSK WERITATPKT GGPSYIYEIR VKSWWVNAGD AFMIYSLAEN 12 60 

40 FCSSNGYTLP LGDHLNHSRS RGIGSLYSEW GDMGHYTTEA GFHSNMYWSS SPAISTSNEQYV 1320 

VSLiATGDQSV FEKLGFAYAT CYKNL 1345 
<212> Type : PRT 
<211> Length : 1345 

SequenceName : SEQ ID 282 

SequenceDescription :

Sequence 



<213> OrganismName : Escherichia coli O157:H7

<400> PreSequenceString :

MSLIIDVISR KTSVKQTLIN PGDVTWIYE PSWQVHAQA SAVARYVREG NDLLIYMQDG 60 

TVIRCNGYFL QAANTAEQSE LVFADGQQLT HITFADTAAG GLAPVELTAQ TTAIESIAPF 120 

LDTVAQTSAF PWGWLAGAAV GGGALGALLA SGGDGDSKTE VINNPTPPAE PGISTATPSFLV 180 

TDNQGDQRGI LATNDITDDT TPTFSGSGQA GATIQIKDSN GNTIASTQVD NNGHWSVSLP 240 

55 TQSAGEHTWS WQIVGSTIT DAGSITLTID NSQASVQVAT TAGDNIINAS EQAAGFTLSG 300 

TSSHLAQGTE LTVTLNGKTY TTSVGANGAW SVQVPTADAQ ALGEGNQAVL VSGKDATGNT 3 60 

VTGAQLLTVD TQPPTLAINT lAQDNIISAA EHNVALVLSG TSNAEAGQTV TLTVNGKSHT 420 

ATVGSDGTWQ VTLPAXEVQA LAEGNYAVNA SVSDRAGNTT SHSANFTVDT SAPWSVNTV 480 

AGDDILNNAE QAVAQIXSGQ VSGASPGDTV TVKLGTHVLT GIVLADGSWN VALDPAVTRT 540 

60 LDRGANTIFV TVTDAAGNTG AASRAITLVG VSPLITINTV SGDDIISGAE KGAPLTLTGS 600 

TQQAETGQTV TVTLAGQSFT TTVQADGSWS LTVPAAAMGN LPDGAVAITA SVTDLSGNTG 660

NTSRTITVDS QAPALSIDPL TADNIINAAE SGQDLPITGT TDAQPGQTVT VTLNGQTYQG 72 0 

WQPDGTWSV TVPAANVGAL ADGNATVTAS VNDVAGNPSS VSRVALVDAT PPWTINPVA 7 80 

TDNVINTPEH AQAQIISGTV TGAQAGDIVT VTLNNVDYTT WDGSGNWSL GVPASWSGL 840 

65 ADGSYPVSVS VTDKAGNTGS QSLTVTVNTA APLIGINSIA GDDVINASEK GADLQITGTS 900 

DQPVNTAITV TLNGQNYTTT TDASGNWSVT VPASAVTALG QANYTVTAAV TSDI6NSATA 960 

SHNVLVDSAL PGVTINPVAT DDIINAABAG VAQTISGQVT GAEDGDTVTI TLGGNTYTAT 1020 
VGSNLTWSVD VPAADIQALG NGDLTVNASV TNQNGNTGSG TRDITIDANL PGLRVDTVAG 1080

DDWNIIEHG QALWTGSSS GLAESTPLTV TINNVEYTTA VQADGSWSVG VTAAQVSAWP 1140 

AGTVNIAVSG ESSAGNSVSI THPVTVDLTP AAITINTIAT DDVINAAEKG ADLTLSGTTT 12 00 

NVEPGQTVTV TFGGKNYTAS VASDGSWTAT VPAADLASLP EGSASALASV SNINGNSASA 1260 

5 VHNYSVDSSA PTIIINTVAS DNIVNASEAD AGVTVSGSTT AEAGQIVTIT LNSPTVQTYQ 13 20 

ATVQADGSWS INIPAADLEA LTDGSHTLTA TVNDKAGKPA STTHNLAVDL TVPVLTINTI 13 80 

AGDDIINATE HGQALVISGS STGGEAGDW TVTLNSKTYT TTLDASGNWS VGVPAADVTA 1440 

LGSGPQTVTA TVTDAAGNSD N 1461 
<212> Type : PRT
<211> Length : 1461

SequenceName : SEQ ID 283

SequenceDescription :

Sequence 


<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString : 

MNRIYRVIWN CTLQVFQACS ELTRRVGKTS TVNLRKSSGL TTKFSRLTLG VLLALSGSVS 60 

GASLEVDNGQ ITNIDTDVAY DAYLVGWYGT GVLNILAGGN ASLTTITTSV IGGNEDSEGT 12 0 

20 ViJVLGGTWRL YDS6NNARPL NVGQSGTGTL NIKQKGHVDG GYLRLGTQAA GVGTVOTEGE 18 0 

DSVLTTELFE IGSYGTGSLN ITDKGYVTSS IVAILGYQAN SNGKWVEKG GEWLIKNNDS 240 

SIEFQIGNQG TGEATIREGG LITAENTIIG GNATGVGTLN VQDQDSVITV RRLYNGYFGN 3 00 

GAVNISNNGL INNKEYSLVG VQDGSHGWN VTDKGHWNFL GTGEAFRYIY IGDAGDGELN 3 60 

VSREGKVDSG IITAGMKETG TGNLTVKDKN SVITNLGTNL GYDGHGEMNI SNEGIiWSNG 42 0 

25 GSSLGYGETG VGKVSITTGG IWEVNKNVYT TIGVAGVGNL NISDGGKFVS QNITFLGDKA 480 

SGIGTLNLMD ATSSFDTVGI NVGNFGSGIV NVSNGATLNS TGYGFIGGNA SGKGIVWIST 54 0 

DSLWNLKTSS TNAQLLQVGV LGTGELNITT GGIVKARDTQ lALNDKSKGD VRVDGQNSLL 600 

ETFNMYVGTS GTGTLTLTNS GTLNVEGGEV YLGVFEPAVG TLNIGAAHGE AAADAGFITN 660 

ATiCVEFGSGE GVFVFNHTKN SDAGYQVPML ITGDDKDGKV IHDAGHTVFN AGNTYSGKTL 720

VNDGLLTIAS HTADGVTGMG SSEVTIASPG TLDILASTMS AGDYTLTNAL KGDGLMRVQL 780

SSSDKMFGFT HATGTEFAGV AQLKDSTFTL ERDNTAALTH AMLQSDIENT TSVIJVGEQSI 840 

GGLAMNGGTIi IFDTDIPAAT LAEGYISVDT LWGASDYTW KGRNYQWGT GDVLIGVPKP 900 

WNDPMANNPn TTLNIiLEHDD NHVGVQLVKA QTVIGSGGSL TLRDIiQGDEV EADKTLHIAQ 960 

NGTWAEGDY G^-RLTTAPGD GLYVNYGLKA LNIHGGQKLT LAEHGGAYGA TADMSAKIGG 1020 

EGDLAINTVR QVSLSNGQND YQGATYVQMG TLRTDADGAL GNTRELNISM AAIVDMGST 1080

QTVETFTGQM GSTVLFKEGS LTVNKGGISQ GELTGGGNLN VTGGTLAVEG LNARYNALTS 1140 

VSPNAEVSIiD NTQGLGRGNI ANDGLLTLKN VTGELRNSIS GKGIVSATAR TDVELDGDNS 1200

RFVGQFNIDT GSAIiSVXTEQK NLGDASVINN GLBTISTERS WAMTHSISGS GDLTKLGTGI 1260 

LTLNNDSSAY QGTTDIVGGE lAFGSDSAIKT TASQHINIHN SGVMSGNVTT AGDVNVMSGG 1320 

TLRVAKTTIG ESAATWRMAA RFK 1343
<212> Type : PRT 
<211> Length : 1343 

SequenceName : SEQ ID 284 
SequenceDescription : 


Sequence 



<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString : 

MGIKQHNGNT KADRLAELKI RSPSIQLIKF GAIGLNAlLF SPLLIAADTG SQYGTNITIN 60
DGDRITGDTA DPSGNLYGVM TPAGNTPGNI NLGNDVTVNV NDASGYAKGI IIQGKNSSLT 120 
ANRLTVDWG QTSAIGINLI GDYTHADLGT GSTIKSNDDG IIIGHSSTLT ATQFTIENSM 180 
GIGLTINDYG TSVDLGSGSK IKTDGSTGVY IGGLNGNNAN GAARFTATDL TIDVQGYSAM 240 
GINVQKNSW DLGTNSSIKT SGDNAHGLWS FGQVSANALT VDVTGAAANG VEVRGGTTTI 300

GADSHISSAQ GGGLVTSGSD ATINFSGTAA QRNSIFSGGS YGASAQTATA VINMQNTDIT 360

VDRNGSLALG LWALSGGRIT GDSLAITGAA GARGIYAMTN SQIDLTSDLV IDMSTPDQMA 420 
lATQHDDGYA ASRINASGRM LINGSVLSKG GLINLDMHPG SWTGSSLSD NVNGGKLDVA 480 
MNNSVWNVTS NSNLDTLALS HSTVDFASHG STAGTFTTLN VENLSGNSTF IMRADWGEG 540 
NGVKPWA 547 

<212> Type : PRT

<211> Length : 547 

SequenceName : SEQ ID 285 
SequenceDescription : 

Sequence



<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString :

MGIDSRNDIP EGIATLGAFM GYSHSHIGFD RGGHGSVDSY SLGGYASWEH ESGFYLDGW 60 

KLNRFESNVA GKMSSGGAAN GSYHSNGLGG HIETGMRFTD GNWNLTPYAS LTGFTADNPE 120 

YHLSNGMESK SVDTRSIYRE IiGATLSYNMR LGNGMEVEPW LKAAVRKEFV DDNRVKVNSD 180 

GNFVNDLSGR RGIYQAGIKA SPSSTLSGHL GVGYSNGAGM ESPWNAVAGV NWSP 234



<212> Type : PRT 
<211> Length : 234 

SequenceName : SEQ ID 286 
SequenceDescription :

Sequence 



<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString :

MKKKVLAIAIi VTVFTGMGVA QAADVTAQAV ATWSATAKKD TTSKLWTPL GSLAFQYAEG 60 
IKGFNSQKGL FDVAIEGDST ATAFKLTSRL ITNTLTQLDT SGSTLNVGVD YNGAAVEKTG 120 
DTVMIDTANG VLGGNLSPLA NGYNASNRTT AQDGFTFSII SGTTNGTTAV TDYSTLPEGI 180 
WSGDVSVQFD ATWTS 195

<212> Type : PRT

<211> Length : 195 

SequenceName : SEQ ID 287 

SequenceDescription :

Sequence



<213> OrganismName : Escherichia coli O157:H7

<400> PreSequenceString : 

MTAESYDDNY LDDEDADWTA TGQGQKSAGD TSFTLAWKPG EEGQKGLIGW FESGDVRAYK 60 
IRFPNGTVDV FRGWVSSIGK AVTAKEVITR TVKVTNVGKP SVAEERSKIT PVSAIKVTPT 120

SGTVAKGKTT TLTVSFEPES ATDKTFRAVS ADPSKATISV KDMTITVNGV ATGKVQIPW 180 

SGNGQFAAVA EVTVTEAGAA G 201 

<212> Type : PRT

<211> Length : 201
SequenceName : SEQ ID 288

SequenceDescription : 

Sequence 



<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString :

MTAESYDDNY LDDEDADWTA TGQGQKSAGD TSFTLAWKPG EEGQKGLIGW FESGDVRAYK 60 

IRFPNGTVDV FRGWVSSIGK AVTAKEVITR TVKVTNVGKP SVAEERSKIT PVSAIKVTPT 120 

SGTVAKGKTT TLTVSFEPES ATDKTFRAVS ADPSKATISV KDMTITVNGV ATGKVQIPW 180 

SGNGQFAAVA EVTVTEAGAA G 201



<212> Type : PRT 

<211> Length : 201 

SequenceName : SEQ ID 289
SequenceDescription :

Sequence 



<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString : 

MLYNIPCRIY ILSTLSLCIS GIVSTATATS SETKISNEET LWTTNRSAS NLWESPATIQ 60

VIDQQTLQNS TNASIADNLQ DIPGVEITDN SLAGRKQIRI RGEASSRVLI LIDGQEVTYQ 120 

RAGDNYGVGL LIDESALERV EWKGPYSVL YGSQAIGGIV NFITKKGGDK LASGWKAVY 180 

NSATAGWEES lAVQGSIGGP DYRINGSYSD QGNRDTPDGR LPNTNYRNNS QGVWLGYNSG 240 

NHRFGLSLDR YRLATQTYYE DPDGSYEAFS VKIPKLEREK VGVFYDTDVD GDYLKKIHFD 300

AYEQTIQRQF ANEVKTTQPV PSPMIQALTV HNKTDTHDKQ YTQAVTLQSH FSLPANNELV 360

TGAQYKQDRV SQRSGGMTSS KSLTGFINKE TRTRSYYESE QSTVSLFAQN DWQFADHWTW 420

TMGVRQYWLS SKLTRGDGVS YTAGIISDTS LARESASDHE MVTSTSLRYS GFDNLELRAA 480

FAQGYVFPTL SQLFMQTSAG GSVTYGNPDL KAEHSNNFEL GARYNGNQWL IDSAVYYSEA 540 

KDYIASLICD GSIVCNGNTN SSRSSYYYYD NIDRAKTWGL EISAEYNGWV FSPYISGNLI 600 

RRQYETSTLK TTNTGEPAIN GRIGLKHTLV MGQANIISDV FIRAASSAKD DSNGTETNVP 660

GWATLNFAVN TEFGNEDQYR INLALNNLTD KRYRTAHETI PAAGFNAAIG FVWNF 715 
<212> Type : PRT

<211> Length : 715 

SequenceName : SEQ ID 290
SequenceDescription :

Sequence



<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString : 

MTKMSRYALI TALAMFLAGC VGQREPAPVE EVKPAPEQPA EPQQPVPTVP SVPTIPQQPG 60
PIEHEDQTJ^ PAPHIRHYDW NGAMQPMVSK MLGADGVTAG SVLLVDSVNN RTNGSLNAAE 120 
ATETLRNALA NNGKFTLVSA QQIiSMAKQQL GLSPQDSLGT RSKAIGIAKKT VGAHYVLYSS 180 
ASGNVNAPTL QMQLMLVQTG EIIWSGKGAV SQQ 213 
<212> Type : PRT 

<211> Length : 213

SequenceName : SEQ ID 291 
SequenceDescription :

Sequence

<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString :

MKSKVLALLI PALLGAGAAH AAEVYNKDGM KLDLYGKVDG LHYFSDNSAK DGDQSYARLG 60 
FKGETQINDQ LTGYGQWEYN IQANNTESSK NQSWTRLAFA GLKFSDYGSF DYGRNYGLDR 120 
YAA 123
<212> Type : PRT 
<211> Length : 123 

SequenceName : SEQ ID 292 

SequenceDescription : 


Sequence 



<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString :

MATPNPLEPV KGAGTTLWVY NGKGDAYANP LSDDDWQRLA KVKDLXPGEM TAEPYDDNYL 60
DDEDADWTAT GQGQKSAGDT SFTLAWKPGE EGQKGLIGWF ESGDVRAYKI RFPNGTVDVF 120 
RGWVSSIGKA VTAKEVITRT VKVTNVGKPS VAEERSEITP ATAIKVTPTS GTVAKGKTTT 180 
LTVSFEPESA TDKTFRAVSA DPSKATISVK DMTITVNGVA TGKVQXPWS GNGQFAAVAE 240 
VTVTEAGAAG 250 

<212> Type : PRT

<211> Length : 250 

SequenceName : SEQ ID 293 
SequenceDescription :

Sequence



<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString :

MATPNPLEPV KGAGTTLWVY NGKGDAYANP LSDDDWQRLA KVKDLXPGEM TAEPYDDNYL 60
DDEDADWTAT GQGQKSAGDT SFTLAWKPGE EGQKGLIGWF ESGDVRAYKI RFPNGTVDVF 120
RGWVSSIGKA VTAKEVITRT VKVTNVGKPS VAEERSEITP ATAIKVTPTS GTVAKGKTTT 180

LTVSFEPESA TDKTFRAVSA DPSKATISVK DMTITVNGVA TGKVQXPWS GNGQFAAVAE 240 
VTVTEAGAAG 250 
<212> Type : PRT 
<211> Length : 250

SequenceName : SEQ ID 294 
SequenceDescription : 






Sequence 



<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString : 

MGWTDMLPEF GGDSYTNADN FMTGRANGVA TYRNTDFFGL VNGLNFAVQY QGNNEGASNG 60 

QEGTNNGRDV RHENGDGWGL STTYDLGMGF SAGAAYTSSD RTNDQVNHTA AGGDKADAWT 120 

AGLKYDANNI YLATMYSETR NMTPF6DSDY AVANKTQNFE VTAQYQFDFG LRPAVSFLMS 180

KGRDLHAAGG ADNPAGVDDK DLVKYADVGA TYYPNKNMST YVDYKIIILLD EDDSFYAANG 240 

ISTDDIVALG LVYQF 255 
<212> Type : PRT

<211> Length : 255 

SequenceName : SEQ ID 295 
SequenceDescription : 

Sequence 



<213> OrganismName : Haemophilus influenzae Rd
<400> PreSequenceString :

MGFIMKLTKT ALCTALFATF TFSANAQTYP DLPVGIKGGT GALIGDTVYV GLGSGGDKFY 60 
TliDLKDPSAQ WKEIATFPGG ERNQPVAAAV DGKIi"YVFGGL QiaSIEKGELQL VNDAYRYNPS 120

DNTWMKLPTR SPRGLVGSSG ASHGDKVYIL GGSNXaSIFNG FFQDTVAAGE DKAKKDEIAA 180

AYFDQRPEDY FFTTELLSYE PSTNKWRNEG RIPFSGRAGA AFTIQGNELV WNGEIKPGL 240

RTAETKQGKF TAKGVQWKNIi . PDLPAPKGKS QDOI^GALSG YSNGHYLVTG GANFPGSIKQ 300

FKEGKLHAHK GLSKAWHNEV YTLNNGKWRI VGELPMNIGY GFSVSYNNKV LIjIGGETDGG 360 
KALTSVKAIS YDGKKLTIE 379 
<212> Type : PRT 
<211> Length : 379

SequenceName : SEQ ID 296 

SequenceDescription : 

Sequence 



<213> OrganismName : Haemophilus 
<400> PreSequenceString :
MGEQYMLTTI LSFLIVTTW AYVSWLKTKG 
STEQLIGWA VSYKGNFSVI AWTVPTVIPL 
<212> Type : PRT 
<211> Length : 101 

SequenceName : SEQ ID 297 
SequenceDescription :

Sequence 



<213> OrganismName : Helicobacter pylori J99
<400> PreSequenceString :

MKNQHKNPLT KALMKTYPYN HFLFFCFILG AFLLGLLSPA YALSIITTKE IDANLLNGAI 60 

ESRWLGKRV FKVEAHGFYF RNNATNSIDI EITSLLRDNQ SFPLTSSAKT SLKIPPNAKI 120 

KKSTILVLKG ENAEEVAKIL GVSKIIEYQKL ENIAQTKAAN DPMYANTPFS NGSDSSFYDN 180 

NPNSPSNNAI NGKDGANGSN GYGANGNDGV NGISGSNGAN GSHSNNNAIG SGIDTDGVLG 240 

VDGVNGSSSS SGGSVGGYEN NFTNHGSTNN NTGG^DNFNN GSSSGGSLGN GGLFPIPFGN 300 

GDTNNSNNST NTTSPTNGSS SNNATNPSSQ ENNYSSQYCK VPELSPNNTM KLDVIAKDGS 360 

CISMNALRDD TKCAYRYDFE AGKAIKQTQY YYVDRENKTQ NIGGCVDLQG AQYAMQLYKD 420 

DSKCALQTTS DKGYGMGKTQ TFQTEIVFRG MDNLIHVAVP CSDYARVQDR IVRYEKNDKT 480 

QTLTPIVDQY YNDPNNPNKQ EILNRGIATQ LSSQYQEFAC GQWEYNDAKL EAKRPTMLKS 540 

YNKLNGEWVE VTPCNFEAGI KSGAWSPYV MGVPSSKVLS DITTSHYPRI ERKNYGEREQ 600 

CQKLYGVNRC QPQYSILILV SPIGAPLTKP LPPKPLNLIY AQPKIMKNTP QPIILSPLKP 660 

PSTGLKAF 668 
<212> Type : PRT 
<211> Length : 668 

SequenceName : SEQ ID 298 

SequenceDescription : 

Sequence 



<213> OrganismName : Helicobacter pylori J99
<400> PreSequenceString : 

MPVIRVLVML ATMMMKLVKT AKEKKVFKNV GISIMGIAFW EAIKDSIKKQ IKKSDWICGN 60 
VKTADDYLKT HPNSWFNSAI GVTAITAMLM NVCFADDQSK KEVAQAQKEA ENARDRANKS 120 
GIELEQEEQK TEQEKQKTEQ EKQKTEQEKQ KTEQEKQKTE QEKQKTSNIE TNNQIKVEQE 180

QQKTEQEKQK TNNTQKDLVN KAEQNCQENH NQFFXKKLGI KAGIAIEIEA ECKTPKPTKT 240 
NQTPIQPKHL PNSKQPHSQR GSKAQELIAY LQKELESLPY SQKAIAKQVD FYRPSSIAYL 300 
ELDPRDFNAT EEWQKENLKI RSKAQAKMLE MRSLKPDPQA HLSTSQSLLL VQKIFADVSK 360 
EIKWANTEK KVEKAGYGYS KRM 383 
<212> Type : PRT 
<211> Length : 383 

SequenceName : SEQ ID 299 



infliaenzae Rd 

DDLKSSKGYF LAGRGLSGLV IGCSMVLTSL 60 
CPLALYIIGW L 101 
SequenceDescription :
Sequence 

<213> OrganismName : Helicobacter pylori J99
<400> PreSequenceString : 

MNYPNLPNSA LEISEQPEVK EITNELLKQIi QNALRSNAHF SEQVELSLKC IVRILEVLLS 60 
LDFFKNANEI DSSLRNSIEW LTNAGESLKL KMKEYERFFS EFNTSMHANE QEVTNTLNAN 120 
AENIKSEIKK LENQLIETTT RLLTSYQIFL NQARDNANNQ ITKNKTQSLE AITQAKtsINAN 180 

NEISNNQTQA ITNITEAKTN ANNEISKNQT QAITNINEAK ESATTQINAN KQEAINNITQ 240
EKTQATSEIT EAKKTDHYQN IDPFEFE 267 
<212> Type : PRT 
<211> Length : 267 

SequenceName : SEQ ID 300

SequenceDescription :

Sequence 



<213> OrganismName : Helicobacter pylori J99

<400> PreSequenceString :

MKFFSKDLFK KVTPLFLSVY FLSPTLTQAK SRFYVASQYQ VGIOyilMKKYN DLKRTIEGAS 60 
FSLGWEINPT NYWFYSRYYF FMDYGNVILN KRTGAQANMF TYGFGGDLIM EYNKNPIiYVF 120 
SLFYGMQVAE NTWTISKHSA NFIIDDWRSI QGFSLKTSNF RMLGLVGFKF QTVIjPHHDAS 180 
lEVGIKWPFA FEYDSPFVRIi FSVFISHTFY L 211 

<212> Type : PRT

<211> Length : 211 

SequenceName : SEQ ID 301 
SequenceDescription :

Sequence



<213> OrganismName : Helicobacter pylori J99
<400> PreSequenceString :

MKKFTLSLFL CCTLLNAEED XFRNNTNETD LTNSFEHGKE NNNLIPAKSD SLESFKEQEN 60 
KEKAKQLMDL KALQSVYFSK NRKLQDNNFN VLYVAGNTNK IRLRYAMTTT FIFDNDPIIY 120
VSLGDPSDFE LTYPTNDHYD LSNMLVIKPL LIGVDTNLTV VGASGTIYTL LFV 173 

<212> Type : PRT 
<211> Length : 173 
SequenceName : SEQ ID 302

SequenceDescription :

Sequence



<213> OrganismName : Mycoplasma pneumoniae
<400> PreSequenceString :

MLDYVPWIGN GYRYGNNHRG SMSSTSGVTT QGQSQNASSN EPAPTFSNVG VGLKANVNGT 60 

LSGSRTTPNQ QGTPWLTLDQ ANLQLWTGAG WRNDKNGQSD ENYTNFASAK GSTNQQGSTT 120 

GGSAGNPDSL KQDKADKSGD SVTVAEATSG DNLTNYTNLP PTSPPHPTDR TRCHSPTRTT 180 

PSGCSCSCAA CWAASRCWSI RVGKMITVSL IPPTKNGLTP N 221



<212> Type : PRT 

<211> Length : 221 

SequenceName : SEQ ID 303 
SequenceDescription : 


Sequence 



<213> OrganismName : Mycoplasma pneumoniae 
<400> PreSequenceString :

MDDITAPQTS AGSSSGTSTN TSGSRSFLPT FSNVGVGLKA NVQGTLGGRQ TTTTGNNIPK 60

WATLDQANLQ LWTGAGWRND KTTSGSTGNA NDTKFTSATG SGSGQGSSSG TNTSAGNPDG 120

LQADKVDQNG QVKTSVQEAT SGDNLTNYTN LPPANLTPTA DWPNALSFTN KNNAQRAQLF 180 

LRGLLGSIPV LVNKSGQDDN SKFKAEDQKW SYTDLQSDQT KLNLPAYGEV NGLLNPALVE 240 

TYFGNTRASG SGSNTTSSPG IGFKIPEQSG TNTTSKAVLI TPGLAWTPQD VGNIWSGTS 300

FSFQLGGWLV TFTDFIKPRA GYLGLQLTGL DVSEATQREL IWAKRPWAAF RGSWVNRLGR 360
VESVWDFKGV WADQAQLAAQ AATSSTTTTA TGATLPEHPN ALAYQISYTD KDSYKASTQG 420

SGQTNSQNNS PYLHPIKPKK VBSXTQLDQG LKNLLDPNQV RTKLRQSFGT DHSTQPQPQS 480 
LKTTTPVFGR SSGNLSSVFS GGC3AGG6SSG SGQSGVDLSP VERVSGH 527
<212> Type : PRT 
<211> Length : 527 

SequenceName : SEQ ID 304
SequenceDescription :

Sequence 



<213> OrganismName : Mycoplasma pneumoniae
<400> PreSequenceString :

MLKIiAVGIFI SPTLTRFSTG FNXjAGSVLDQ 

AGSSSGTSTN TSGSRSFLPT PS2OTGVGLKA 

LWTGAGWRND KASNKQSDEN HTTFKSATGS 

TTQDGAPQSN STTESASNYD HLiX>PNLTPTS 
15 LVNRSGSDDS NKFQATDQKW SYTDLKSDQT 

SGSNTTSSPG IGFKIPEQNN DSKAVLITPG 

DFVKPRAGYL GLQLTGLDAS DATQRAIilWA 

QAQAAAQAAT TAAATGDALP EH£>NAIiAYQI 

KPKKVENTTQ LDQGLKTCWT PTRFAPSCAK 
20 VCLWGVLEE QTAPIRWTSP PLJSTGWVGGLW 

FFLNCSLTLF IWTTASLATG liTWGHFTST 

PWTYRNTSFS SLPLTGENPG AWALVRDNTA 

LRRYDLAGRC TTSTFRS 

<212> Type : PRT 
<211> Length : 737

SequenceName : SEQ ID 305 
SequenceDescription : 

Sequence 


<213> OrganismName : Mycoplasma pneumoniae 

<400> PreSequenceString :

MLDYIPWXGN GHRYGNDHRG SRSSTSGVTT QGQQSQNASG TEPASTFSNV GVGLKANVQG 60 
TLGGSQTTTT GKDIPKWPTL DQANLQLWTG AGWRNDKASS GQSDENHTKF TSATGSGQQG 120 

35 SSSGTTNSAG NPDSLKQDKV DKSGDSVTVA ETTSGDNLTKT YTNLPPNLTP TADWPNALSF 180 
TNKNNAQRAQ LFLRALLGSI PVLVNKSGQD DSNRFQATDQ KWSYTELKSD QTKLNLPAYG 240 
EVNGLLNPAL VEVYGLSSTQ GS STGAGGAG GNTGGDTNTQ TYARPGIGFK LPSTDSESSK 3 00 

ATLITPGLAW TAQDVGNLW SaTSLSFQLG GWIiVTFTDFI KPRSGYLGLQ LTGLDAlsIDSD 3 60 

QRELXWAPPA LNRLSWQLGQ PI*GPRGECVG FQGGVGGSSS VRLASSYKYH HRKEGYLIGA 420 

HQCFGLSGEL YRPGFVQGFH SKIiRPKPKHL PLPALGAGEK SRFLW 465
<212> Type : PRT 
<211> Length : 465 

SequenceName : SEQ ID 306 
SequenceDescription : 

Sequence



<213> OrganismName : Mycoplasma pneumoniae
<400> PreSequenceString :

MLGSIPVLVN RSGSDSNKFQ ATIDQKWSYTD LQSDQTKLNL SAYGEVNGLL NPALVETYFG 60

TTRTSSTANQ NSTTVPGIGF KXPEQNNDSK ATLITPGLAW TPQDVGNLW SGTTVSFQLG 120

GWLVTFTDFV KPRAGYLGLQ LSGLNASDSD QRELIWAPRP WAAFRGSWVN RLGRVESVWD 180 

LKGWADQAQ LAAQAATSST TTTATGATLP EHPNALAYQI SYTDKDSYKA STQGSGQTNS 240

QNNSLYLHLI KPKKVESTTQ LDQGLKNLLD PNQVRTKLRQ SFGTDHSTQP QPQSLKTTTP 300

VFGAMSGNLG SVLSGGGA6G AGSTNSVDLS PVERVSGSLT INRNFSY 347



<212> Type : PRT 

<211> Length : 347 

SequenceName : SEQ ID 307 
SequenceDescription : 


Sequence 



<213> OrganismName : Mycoplasma pneumoniae
<400> PreSequenceString :
MGQQGQSGTS AGNPDSLKQD KXSKSGDSLT TQDGNATGQQ EATNYTNLPP NLTPTADWPM 60
ALSFTNKNNA HRAQLFLRGL LGSIPVLWR SGSDSNKFQA TDQKWSYTDL QSDQTKLNLP 120 
AYGEVNGLLN PALVETYFGN TRAGGSGSNT TSSPGIGFKI PEQISJUDSKAT LITPGLAWTP 180 



VLDYVPWIGN GHRYGNNHRG 

NVQGTLGGSQ TTTTGKDIPK 

GQQGGSTTGG SAGNPDSLKQ 

DWPFALSFTN KNNAQRAQLF 

KLNLPAYGEV NGLLNPALVIE 

LAWTPQDVGN LWSGTSLSF 

KRPWAAFRGS WVNRLGRVES 

SSTDKDSYKA STQSSGQTNS 

ALVQTIPPKP NPNPSKQPHR 

GNYPVGVGGI WRILKVCKT 

TTTLKRQQFS YTRPDEVALR 

KGITAGSGSQ QTTYDPTRTE 



VDDITAPKTG , 60 

WPTLDPANLQ 120 

DKISKSGQNL 180 
LRGLLGSIPV , . 240 

TYFGTTRAGG 3 00 

QLGGWLVTFT 3 60 

VWDLKGVWQD 42 0 

QNTSPYLHLI 480 

CLGRIWTLA 540 

LLFISIFISI 600 

HTNAINPRLT 660 

AALTTATTFV 720 
737 



wo 2005/076010 



96/341 



PCT/IN2005/000037 



QDVGNLWSG TSLSFQLGGW LVSFTDFIKP RAGYLGLQLS GLDASDSDQR ELIWAKRPWA 240 

AFRGSWVNRIi GRVESVWDLK GVWADQAQLA AQAATSEASG SALAPHPNAL AFQVSWEAS 300

AYSSSTSSSG SGSSSNTSPY LHLIKPKiCVE STTQLDQGLK NLLDPNQVRT KLRQSFGTDH 360

STQPQSLKTT TPVFGTSSGN IGSVLiSGGGA GGGSSGSGQS GVDLSPVERV SGH 413 

<212> Type : PRT
<211> Length : 413 

SequenceName : SEQ ID 308 

SequenceDescription : 

Sequence



<213> OrganismName : Mycoplasma pneumoniae

<400> PreSequenceString :
MGIiQLSGLDA SDSDQRELIW AKRPWAAFRG SWVNRLGRVE SVWDLKGVWA DQAHSAVSES 60
QAATSSTTTT ATGDTLPEHP NALAYQISST DKDSYKASTQ GSGQTNSQNT SPYLHLIKPK 120 
KVTASDKLDD DLKNLLDPNE VRVKLRQSFG TDHSTQPQPQ PLKTTTPVPG TNSGNLGSVL 180 
SGGGTTQDSS TTNQLSPVQR VSGWLVGQLP STSDGNTSST NNLAPNTNT6 NEWGVGDLS 240 
KRASIESSRL WIALKP 256 
<212> Type : PRT

<211> Length : 256

SequenceName : SEQ ID 309 

SequenceDescription : 

Sequence



<213> OrganismName : Mycoplasma pneumoniae

<400> PreSequenceString :

MRDNTAKGIT AGSGSQQTTY DPARTEATLT TTTFALRRYD LAGRALYDLD FSKLNPQTPT 60
RDANCQITFN PFGGFGLSGS APQQVJNEVKN KVPVEVAQDP TDPYRFAVLL VPRSWYYEQ 120
LQRGLALPNQ GSSSGSGQQN TTXGAYGLKV KNAEADTAKS MEKLQGDESK SSNGSSSTST 180 
TTQRGSTNSD TKVKALKIEV KKKSDSEDNG QLQLEKNDLA NAPIKRGEES GQSVQLKADD 240 
FGTAPSSSGS GGNSNPGSPT PWRPWLATEQ IHKDLPKWSA SILILYDAPY ARNRTAIDRV 300

DHLDPKVMTA NYPPSWRMPK WNHHGLWDWK ARDVLFQTTG FDESHTSNTK QGFQKEADSD 360

KSAPIALPFE AYFANXGNLT WFGQALLVFS GNGHVTKSAH TAPLSIWLYX YLVKAVTFRL 420
LLANSLLSKS NXYKKTAN 438

<212> Type : PRT 
<211> Length : 438

SequenceName : SEQ ID 310
SequenceDescription :



Sequence 



<213> OrganismName : Mycoplasma pneumoniae 

<400> PreSequenceString :

MRDNIAKGIT AGSNTQQTTY DPTRTEATLT TATTFALRRY DLAGRALYDL DFSKLNPQTP 60 
TRDQTGQITF NPFGGFGLSG AAPQQmEVK DKVPVEVAQD PSNPYRFAVL LVPRSWYYE 120

QLQRGLALPN QGSSSGSGQQ NTTIGAYGLK VKNAEADTAK SNEKLQGYES KSSNGSSSTS 180 
TTQRGGSSNE NKVKALQVAV KKKSGSQGNS GDQGTEQVEL ESNDLANAPI KRGSNNNQQV 240 

QLKADDFGTA PSSSGSGTQD GTPTPWTPWL TTEQIHNDPA KFAASILILY DAPYARNRTA 300

IDRVDHLDPK VMTANYPPSW RTPKWNHHGL WDWKARDVLL QTTGFFNPRR HPBWFDGGQT 360 
VADNEKTGFD VDNSENTKQG FQKEADSDKS APIALPFEAY FANIGNLTWF EQALLVFGIC 420 
LS 422 
<212> Type : PRT 

<211> Length : 422

SequenceName : SEQ ID 311 
SequenceDescription : 



Sequence 

<213> OrganismName : Mycoplasma pneumoniae
<400> PreSequenceString :

MLWPFRWVWW KRVLTSQTRA PAKPNPLTVP PTCTWWSLRK LPNPTKLDDD LKNLLDPNEV 60 
RARMLKSFGT ENFTQPQPQP QALKTTTPVP GTSSGNLGSV LSGGGYHAGL KHHQSTVTRS 120 
TGEWVDR 127
<212> Type : PRT
<211> Length : 127
SequenceName : SEQ ID 312
SequenceDescription : 



Sequence 

<213> OrganismName : Mycoplasma pneumoniae 
<400> PreSequenceString :

MRDNSAKGIT AGSESQQTTY DPTRTEAALT ASTTFALRRY DIjAGRALYDL DFSRLNPQTP 60 
TRDQTGQITF NPFGGFGLSG AAPQQWNEVK NKVPVEVAQD PSNPYRFAVL LVPRSWYYE 120 

QLQRGLALPN QGSSSGSGQQ NTTIQAYGLK VKNAEADTAK SNEKLQGDBS KSSNGSSSTS 180
TTTQRGGSSG DTKVKZUIjQVA VKKKSGSQGKT SGEQGTEQVE LESNDLANAP IKRGEESGQS 240
VQLKAADFGT TPSSSGSGGN SNPGSPTPWR PWLATEQIHK DLPKWSASIL ILYDAPYARN 300

RTAIDRVDHIi DPKVMTANYP PSWRTPKWNH HGLWDWKARD VLLQTTGFFN SRRHPEWFDQ 360
GQ?V:^rADNTQT GFDTDDTDNK KTRLSKGSWL RQAGPDEPPV WSVLRQHWQP HT-VPJ^SAPGV 420

WDLFVLIIT 428
<212> Type : PRT 
<211> Length : 428 

SequenceName : SEQ ID 313 
SequenceDescription : 

Sequence



<213> OrganismName : Mycoplasma pneumoniae

<400> PreSequenceString :
MFGIiKVKNAE ADTAKSNEKL QGAEATGSST TSGSGQSTQR GGSSGDTKVK ALQVAVKKKS 60

GSQGNSGDQG TEQVELESND LANAPIKRGS NPASPTQGSR LRUHPIQFGI WSIRHPHPLK 120 

AVA.CDRA1TSQ GPPQMIRIjDP HSVRCALCIi 149 

<212> Type : PRT 

<211> Length : 149
SequenceName : SEQ ID 314

SequenceDescription : 



Sequence 



<213> OrganismName : Mycoplasma pneumoniae
<400> PreSequenceString :

MFGXiKVKDAT VDSSKQSTES LKGEESSSSS TTSSTSTTQR GGSSGDTKVK ALQVAVKKKS 60
DSEDMGQIEL ETNNLANAPI KRGSNNNQQV QLKADDFGTS PSSSESGQSG TPTPWTPWLA 120

TEQIHKDLPK WSASILILYD APYAKNRTAI DRVDHLDPKV MTAiJYPPSWR TPKWNHHGLW 180

DWKARDVLVQ TTGFFNPRRH PDWFDQGQAV AENTQTGFDT DDTDNKKQGF RKQGEQSPAP 240
lALPFEAYFA NIGNLTWFGQ ALLVFGICLS 270 
<212> Type : PRT 
<211> Length : 270

SequenceName : SEQ ID 315 

SequenceDescription :

Sequence 



<213> OrganismName : Mycoplasma pneumoniae 

<400> PreSequenceString :

MGSQNQGSTT TTSAGNPDSL VTDKVDQKGQ VQTSGQNLSD TNYTNLSPNF TPTSDWPNAL 60 
SFTNKNNAQR AQLFLHGLLG SIPVLVNKSG ENNEKFQATD QKWSYTELKS DQTKLNLPAY 120 
GEVISTGLLNPA LVETYPGTTR TSSTANQNST TVPGIGFKIP EQNTOSKAVL ITPGLAWTPQ 180

DVGNLWSGT SFSFQLGGWL VSFTDFVKPR AGYLGLQLTG LDASDATQRA LIWAPPALSG 240

LSWQLGQPVG PRGECVGPEG GVGGSSSVRL ARIYHHRNRG YLTGAPECFG LSGECGGSEC 300

LQAKHELRPN PIH 313 
<212> Type : PRT
<211> Length : 313 

SequenceName : SEQ ID 316 

SequenceDescription :



Sequence



<213> OrganismName : Mycoplasma pneumoniae 
<400> PreSequenceString :

MSFGLVGTVN NNGWKSPFRH ETKYRAGYDK FKYYKTHYRG AKKAGTNDDR WRWTAWFDLD 60 
FAHQKIVLIE RGELHRQADL KKSDPATNET SKTVWGSIKE KLLQNVNNLH SEKGVFLWFR 120 
QSGFTTTRN 129

<212> Type : PRT 

<211> Length : 129 

SequenceName : SEQ ID 317
SequenceDescription :

Sequence



<213> OrganismName : Mycobacterium tuberculosis H37Rv

<400> PreSequenceString :

MAEPIiAVDPT GLSAAAAKLA GIjVFPQPPAP IAVSGTDSW AAINETMPSI ESLVSDGLPG 60 

VKAALTRTAS NMNAAADVYA KTDQSLGTSL SQYAFGSSGE GLAGVASVGG QPSQATQLLS 120 

TPVSQVTTQIj GETAAELAPR WATVPQIiVQ LAPHAVQMSQ NASPXAQTIS QTAQQAAQSA 180 
QGGSGi^?MP>.Q LASAEKPATE QAEPVHEVTN DDOGDQGDVQ PAEWAAARD - -^^^QQ - 240.. 

PGGGVPAQAM DTQAGARPAA SPIiAAPVDPS TPAPSTTTTL 280



<212> Type : PRT 

<211> Length : 280 

SequenceName : SEQ ID 318
SequenceDescription :

Sequence 



<213> OrganismName : Mycobacterium tuberculosis H37Rv
<400> PreSequenceString : 

MRYLIATAVL VAWLVGWPA AGAPPSCAGL GGTVQAGQIC HVHASGPKYM LDMTFPVDYP 60
DQQALTDYIT QNRDGFVNVA QGSPLRDQPY QMDATSEQHS SGQPPQATRS WLKFFQDLG 120 
GAHPSTWYKA FNYNLATSQP ITFDTLFVP6 TTPLDSIYPI VQRELARQTG FGAAILPSTG 180 
LDPAHYQNFA ITDDSLIFYF AQGELLPSFV QACQAQVPRS AIPPLAI 227 
<212> Type : PRT 

<211> Length : 227

SequenceName : SEQ ID 319
SequenceDescription :

Sequence

<213> OrganismName : Mycobacterium tuberculosis H37Rv
<400> PreSequenceString : 

MKMVKSIAAG LTAAAAIGAA AAGVTSIMAG GPWYQMQPV VFGAPLPLDP ASAPDVPTAA 60 
QLTSLLNSLA DPNVSFANKG SLVEGGIGGT EARIADHKLK KAABHGDLPL SFSVTKTIQPA 120 
AA6SATADVS VSGPKLSSPV TQNVTFVJSTQG GWMLSRASAM ELLQAAGKT 168
<212> Type : PRT 
<211> Length : 168 

SequenceName : SEQ ID 320 

SequenceDescription : 


Sequence 



<213> OrganismName : Mycobacterium tuberculosis H37Rv
<400> PreSequenceString : 

MTYSPGNPGY PQAQPAGSYG GVTPSFAHAD EGASKLPMYL NIAVAVLGLA AYFASFGPMF 60

TLSTELGGGD GAVSGDTGLP VGVALLAALL AGVALVPKAK SHVTWAVLG VLGVFLMVSA 120 

TFNKPSAYST GWALWWLAF IVFQAVAAVL ALLVETGAIT APAPRPKFDP YGQYGRYGQY 180 

GQYGVQPGGY YGQQ6AQQAA GLQSPGPQQS PQPPGYGSQY GGYSSSPSQS GSGYTAQPPA 240 

QPPAQSGSQQ SHQGPSTPPT GFPSFSPPPP VSAGTGSQAG SAPVNYSNPS GGEQSSSPGG 300

APV 303



<212> Type : PRT

<211> Length : 303 

SequenceName : SEQ ID 321 
SequenceDescription : 


Sequence 



<213> OrganismName : Mycobacterium tuberculosis H37Rv
<400> PreSequenceString :
MKCPGVSDCV ATVRHDNVFA lAAGLRWSAA VPPLHKGDAV TKLLVGAIAG GMLACAAILG 60
DGIASADTAL IVPGTAPSPY GPLRSLYHFN PAMQPQIGAN YYNPTATRHV VSYPGSFWPV 120 
TGLNSPTVGS SVSAGTNNLD AAIRSTDGPI FVAGLSQGTL VLDREQARLA NDPTAPPPGQ 180 
LTPIKAGDPN NLLWRAFRPG THVPIIDYTV PAPAESQYDT INIVGQYDIF SDPPNRPGNL 240
LADLNAIAAG GYYGHSATAF SDPARVAPRD ivri'TNSLGA TTTTYFIRTD QLPLVRALVD 300

MAGLPPQAAG TVDAALRPII DRAYQPGPAP AVNPRDLVQG IRGIPAIAPA lAIPIGSTTG 360

ASAATSTAAA TAAATNALRG ANVGPGANKA LSMVR6LLPK GKKH 404 
<212> Type : PRT

<211> Length : 404 

SequenceName : SEQ ID 322 

SequenceDescription :

Sequence



<213> OrganismName : Mycobacterium tuberculosis H37Rv
<400> PreSequenceString :

M.QT.T.z-TT • T,pppFDAIPN -PIEDIOTLV/i ^H-rA-r^.j.r.qT- rr.r^7\a,QLGEI -^tjirr.-r... ^r. ^0 

KAPHCPAKSD QTPAGAAGDG DLPEVGGRVT SPli'UPPVAAL TGYSANIGGL SVPHSWIMLPP 120

AVRQVAAMFP GATPMYMTGS SDGSYAGLAA AGLAGTGLAG LAARGGSAPT PAAAAPAGAG 180 
GAGPAATRPA AQQTPAVPAA AAGSAIPGLP PGLPPGWAN LAATIoAAIPG ATIIWPPSP 240

NANQ 244 
<212> Type : PRT 
<211> Length : 244

SequenceName : SEQ ID 323 
SequenceDescription : 

Sequence 

<213> OrganismName : Mycobacterium tuberculosis H37Rv 
<400> PreSequenceString : 

MDVALGVAVT DRVARLALVD SAAPGTVIDQ FVLDVAEHPV EVLTETWGT DRSIiAGEKHR 60 

LVATRLCWPD QAKADELQHA LQDSGVHDVA VISEAQAATA LVGAAHAGSA VLLVGDETAT 120 

LSWGDPDAP PTMVAVAPVA GADATSTVDT LMARLGDQAL APGDVFLVGR SAEHTTVLAD 180

QLRAASTMRV QTPDDPTFAL ARGAAMAAGA ATMAHPALVA DATTSLPRAE AGQSGSEGEQ 240 

LAYSQASDYE LLPVDEYEEH DEYGAAADRS APLSRRSLLI GNAWAFAVI GFASLAVAVA 300

VTIRPTAAirK PVEGHQMAQP GKFMPLLPTQ QQAPVx'PPPP DDPTAGFQGG TIPAVQfTWP 360

RPGTSPGVGG TPASPAPEAP AVPGWPAPV PIPVPX CIPP FPGWQPGMPT IPTAPPTTPV 420

TTSATTPPTT PPTTPVTTPP TTPPTTPVTT PPTTPPTTPV TTPPTTVAPT TVAPTTVAPT 480

TVAPTxVAPA TATPTTVAPQ PTQQPTQQPT QQMPXQCiQTV APOTVAPA^JQ PPSGGRNGSG 540

GGDLFGGF 548
<212> Type : PRT
<211> Length : 548

SequenceName : SEQ ID 324

SequenceDescription : 

Sequence 



<213> OrganismName : Mycobacterium tuberculosis H37Rv

<400> PreSequenceString : 

MKNARTTLIA AAIAGTLVTT SPAGIANADD AGLDPNAAAG PDAVGFDPNL PPAPDAAPVD 60 
TPPAPEDAGF DPNLPPPLAP DFLSPPAEEA PPVPVAYSVN WDAIAQCESG GNWSINTGNG 120 
YYGGLRFTAG TWRANGGSGS AAKASREEQI RVAENVLRSQ GIRAWPVCGR RG 172 


<212> Type : PRT 
<211> Length : 172 

SequenceName : SEQ ID 325 

SequenceDescription : 


Sequence 



<213> OrganismName : Mycobacterium tuberculosis H37Rv 
<400> PreSequenceString : 
MTRLIPGCTL VGLMLTLLPA PTSAAGSNTA TTLFPVDEVT QLETHTFLDC HPNGSCDFVA 60
GANLRTPDGP T6FPPGLWAR QTTEIRSTNR LAYLDAHATS QFERVMKAGG SDVITTVYPG 120 
EGPPDKYQTT GVIDSTNWST GQPMTDVNVI VCTHMQWYP GVNLTSPSTC AQANFS 176 

<212> Type : PRT 
<211> Length : 176

SequenceName : SEQ ID 326 
SequenceDescription : 
Sequence



<213> OrganismName : Mycobacterium tuberculosis H37Rv
<400> PreSequenceString : 

MTPGLLTTAG AGRPRDRCAR IVCTVFIETA WATMFVALIi GLSTISSKAD DIDWDAIAQC 60 
ESGGNWAANT GNGLYGGLQI SQATWDSNGG VGSPAAASPQ QQIEVADNIM KTQGPGAWPK 120

CSSCSQGDAP LGSLTHILTP LAAETGGCSG SRDD 154 
<212> Type : PRT 
<211> Length : 154 

SequenceName : SEQ ID 327 

SequenceDescription : 



Sequence 



<213> OrganismName : Mycobacterium tuberculosis H37Rv
<400> PreSequenceString : 

MMQQAVSGIT GALGGAVGGV MGPLTQLPQQ AMQAGQGAMQ PLMSALQQTY GAEGLDVADG 60 
ARLVDSIEGE PGLGGEPGAG DVGAGGGGGG TTPTGYLGPP PVPTSSPPTT PAGAPAKSVT 120

PDPVSGTPRA SGPAGMTGMP MVPPGAL6AG AEGANKDKPV EKRVTGCAEW STGQGPLNST 180 
AECSGEICRR QAGGHQVDAT DPCCAERRQG 210 
<212> Type : PRT 
<211> Length : 210 

SequenceName : SEQ ID 328

SequenceDescription : 



Sequence 



<213> OrganismName : Mycobacterium tuberculosis H37Rv 

<400> PreSequenceString :

MIRELVTTAA ITGAAIGGAP VAGADPQRYD GDVPGMNYDA SLGAPCSSWE RFIFGRGPSG 60 
QAEACHFPPP NQFPPAETGY WVXSYPLYGV QQVGAPCPKP QAAAQSPDGL PMLCLGARGW 120 
QPGWFTGAGF FPPEP 135 
<212> Type : PRT 
<211> Length : 135

SequenceName : SEQ ID 329 

SequenceDescription : 



Sequence 



<213> OrganismName : Mycobacterium tuberculosis H37Rv
<400> PreSequenceString :

MKTTGTTIKL GIVWLVLSVF TVMIIWFGQ VRFHHTTGYS AVFTHVSGLR AGQPVRAAGV 60 
EVGKVAKVTL IDGDKQVLVD FTVDRSLSLD QATTASIRYL NLIGDRYLEL GRGHSGQRLA 120 
PGATIPLEHT HPALDLDALL GGFRPLFQTL DPDKVNSIAS SIITVFQGQG ATINDILDQT 180 
ASLTATLADR DHAIGEWNN LNTVLATTVK HQTEFDRTVD KLEVLITGLK NRADPLAAAA 240 
AHISSAAGTL ADLLGRIVHC CTAASGTSRA SSSRS 275 
<212> Type : PRT 
<211> Length : 275 

SequenceName : SEQ ID 330 

SequenceDescription : 



Sequence 



<213> OrganismName : Mycobacterium tuberculosis H37Rv
<400> PreSequenceString :

MTPRSLVRIV GVWATTLAL VSAPAGGRAA HADPCSDIAV VFARGTHQAS GLGDVGEAFV 60 
DSLTSQVGGR SIGVYAVNYP ASDDYRASAS NGSDDASAHI QRTVASCPNT RIVLGGYSQG 120 
ATVIDLSTSA MPPAVADHVA AVALFGEPSS GFSSMLWGGG SLPTIGPLYS SKTINLCAPD 180 
DPICTGGGNI MAHVSYVQSG MTSQAATFAA NRLDHAG 217 
<212> Type : PRT 
<211> Length : 217 

SequenceName : SEQ ID 331 

SequenceDescription : 



Sequence 
<213> OrganismName : Mycobacterium tuberculosis H37Rv
<400> PreSequenceString :

MISTTRIDPL WILSVAFASM lALATLLTLI NQWGTPYIP GGDSPAGTDC SEIjASWVSNA 60 

ATARPVFGDR FNTGNEEAAL AARGFQQGTA PKIAIiVIGWNG HHTAVTLPDG TPVSSGEGGG 120 

VRVGGGGAYQ PKFTHHMYLP MDVDAGEDQP PAPDEPVTAV DDVEPEMPAP CPTQRPPVTP 180

RHNLCNKIiRT MPGALSAALA AAAPVWPAPI SGCRGFSTSL IiAKRNHPVIV GK 232 



<212> Type : PRT 
<211> Length : 232 
SequenceName : SEQ ID 332

SequenceDescription : 

Sequence 



<213> OrganismName : Mycobacterium tuberculosis H37Rv
<400> PreSequenceString :

MTTMITLRRR PAVAVAGVAT AAATTVTLAP APANAADVYG AIAYSGNGSW GRSWDYPTRA 60 
AAEATAVKSC GYSDCKVLTS FTACGAVAAN DRAYQGGVGP TliAAAMKDAL TKLGGGYIDT 120 
WACN 124 
<212> Type : PRT

<211> Length : 124 

SequenceName : SEQ ID 333 

SequenceDescription : 

Sequence

<213> OrganismName : Mycobacterium tuberculosis H37Rv 
<400> PreSequenceString : 

MAGLNIYVRR WRTALHATVS ALIVAILGLA ITPVASAATA RATLSVTSTW QTGFIARFTI 60 
TNSSTAPLTD WKLEFDLPAG ESVLHTWNST VARSGTHYVL SPANWNRIIA PGGSATGGLR 120

GGLTGSYSPP SSCLLNGQYP CT 142 

<212> Type : PRT

<211> Length : 142

SequenceName : SEQ ID 334
SequenceDescription :

Sequence 



<213> OrganismName : Mycobacterium tuberculosis H37Rv

<400> PreSequenceString :

MLTRAIKTQL VLLTVLAVIA VWLGWYFLR IPSLVGIGRY TLYAELPRSG GLYRTANVTY 60 
RGITIGKVTG VEPTERGARA TMSIDNGYQI PTDASANVHS VSAVGEQFVD LVSTRTSGPY 120

LRHGQTITTT TVPSQIGPAL DAANRGLAVL PKDRVASVLH EASEAVGGLG SSLNRLIEAT 180 
QAIAHDVRGS LEDXDDIIER SAPIIDSQVN SGNEIARWAA NLNTLAAQTA QTDPAVRSIL 240 

ANAAPTADQV NATPSDVRES LPQTLANLEV VIDMLKRYHN GVEQALVFLP QSGAIAQSVT 300
TEFPGQAGLG VGGLALNQPP PCLTGFLPAS EWRSPADTST APLPKGTYCR IPMDASNWR 360 
GARNNPCVDV PGKRAATPRE CRSNEAYVPG GTNPWYGDPN QMLSCPAPAA RCDQPVKPGQ 420

VIPAPSVNNG INPLPADQLP GTPPPVNDPL QRPGSGTVQC NGQQPNPCVY TPSTFPTTIY 480 
DVQSGKWAP DGWYSVBAS THAGADGWKV MLAPTG 516 

<212> Type : PRT

<211> Length : 516 

SequenceName : SEQ ID 335 
SequenceDescription : 

Sequence



<213> OrganismName : Rickettsia prowazekii 
<400> PreSequenceString : 

MLNNTQFLNL MKSYMKPEFY MSSIKNTTNL DLSSITNTIQ KAMNIFFTTN KISTESMQSI 60
FKKNSEIIQN NINTILNSTK EVINSKDFKQ ATEYHQKCVK SIYETSMDNA KELANIAYEA 120

SNKIFEAANK HITKNIHNAS NNIHNTAEQV QKNFNNKSA 159 

<212> Type : PRT 

<211> Length : 159 

SequenceName : SEQ ID 336 
SequenceDescription :

Sequence 
<213> OrganismName : Rickettsia prowazekii
<400> PreSequenceString : 

MNIKLVTYPIj ILVSSLKVNA DLNHIQDSFK YQEAEQIiTIE LPWNDCTAIH KFIiEEKLFFS 60 
EQQIKKENKI HEKYKQEYLQ HNNKLSDPSM QFIiEKKSEIKT SVETIiISSFL KFCEDNFQTS 120
KSKSHSIiNFP QKQQDQWLHN IRNENYKTYY KKKYEDNTFR NIN 163 
<212> Type : PRT 
<211> Length : 163 

SequenceName : SEQ ID 337
SequenceDescription :

Sequence 



<213> OrganismName : Rickettsia prowazekii
<400> PreSequenceString :

MKKLLIiIATA SATIIiSSSVS FAECIDNEWY LRADAGVAMF NKEQDKATGV KLKSNKAIPI 60 
DLGIGYYISE NVRADLTLGT TIGGKLKKYG AATNTHFTGT NVSVSHKPTV TRLLINGYVD 120

LTSFDMPDVF VGGGVGPALV KEKISGVSGL ASNTKNKTNV SYKLIFGTSA QIADGVKVEL 180 
AYSWINDGKT KTHNVMYKGA SVQTGGMRYQ SHNLTVGVRF GI 222 
<212> Type : PRT

<211> Length : 222 

SequenceName : SEQ ID 338

SequenceDescription : 

Sequence



<213> OrganismName : Rickettsia prowazekii
<400> PreSequenceString : 

MKKNMRKQML KIISIIIISL LLSSCSESTR DENGLLTDSQ STIIRDYIIS QNSKKTLKVNL 60 

KEKFGSNLKG VKLIGIKLTN EDLSGIDFTS CEILRTDFMG SNLEKAILTN SVIQESMFAD 120

SVIPCWISGYN ADFQGSIFWKr ITLQNTNFVQ SNFSDTAFNK STIINVNFEN SPCFSNVLWCH 180 

SNIDSSNFQK THLKNHSFKN TNVMNSIFYG ADLGKSVINN TNFTlJJNYFES SDLSNTKFTS 240 

VTIKDSNFTQ SIFWSVNFWBr IQSNNSFFSY TSFEDSTLHN IHLTKCDLQN STINSSVFlsJN 300 

FKIDNAXLTN MSLNDNTFNN LSIKNSNTNF VRINKSKGFN ITLLNTNYSN AIFSNNDLKE 360 

FKVINTDLNN SEIINSNFTN GQFIWsiWFSQ SLIQNVNFTD VKITLGNLNQ VALINSMLIN 420

TNIIMSVLSN SQINNINYQA YYSFINTNVS NNIVINDNSM QIPPNNIVIN SEKDLQNISN 480 

LANMNLTNFN LSNLVFNGVD FSKSIFKKAN LTNTVIKNSI LKDAtJFSAAI LTKTDFSKSI 540 
LTGSIFKFAQ IDQTCFSNSD LTNTDFTEAT IKNTAFDNAN THGIKGLE 588
<212> Type : PRT
<211> Length : 588

SequenceName : SEQ ID 339
SequenceDescription : 

Sequence 
<213> OrganismName : Porphyromonas gingivalis W83
<400> PreSequenceString : 

MIQKFTNVKL NDMRKILSFL MMCSLHLGLQ SQTWH6DPDS VAALPSIGIQ ESSCTRITFE 60 

WFPGFYSVE KREGNQVFQR ISMPGCGSPG NLGEAELPVL KKMIAVPEFS TANVAVKIKE 12 0 

TETFDNYNIY PNPTYWEEL PEGGTYLVEA PAINNDYYSQ NVSLPSTHYV YSQDGYFRSQ 180 

RFIEVTLYPF RYNPVRQEIL FAKKIEVTIT FDNPQPPLQK NTGIFNKVAS SAFINYEADG 24 0 

KSAIENDMVF SRGTTTYISG NVASNLPQNC DYLVIYDDMF NVNQQPHDEI KRLCEHRAFY 3 00 

NGFDVAAVSI KDVLNSPPSKT ATSYINETKL KNFIRSVYNQ SNAKRTLDGK LGYVLLIGKP 3 60 

LSKYLADTDN TKVPTSFIHKT VSLIPSHPTF GSICASDYFF SCVSPLDTVG DLFIGRFSVT 420 

NAHELHNLIE KTINKEISYN PIAHKNILYA EGKGCDAPIL RLFLKEIASG YTVNSILKSN 480 

QVSAIDSIPD CLNNGSHHFY FNTHGMPTVW GIGQGLDWT LTARLNNTSS QGLCTSLSCS 54 0 

SAVADSTIRS L6EVLTTYAP NKGFSAFLGG SRATQYAVYL EGPCPPSEFY EYLPYSLYHN 60 0 

LSTWGEMLL SSIINTNSVD TYSKFNFNLL GDPALNIMAH GMEVSNCITL PNNTIISSPI 660 

TIKNGGCLKI PBKGVLHFTN NGSIQVMSGG TLEIGNQAKI SGETGANPTF ITVYGDGLAI 72 0 

NKQVEIDNID RLNLFSTHSV MPKPHFDSVK FNSAPLYTTN CIVEISNCEF TNRSDIISKN 780 

CDLSVENSMF SSS6ITVFKP MATSSITGLS TKAKITDNTF FATGNFAYHI TNTPGLTATS 840 

NAAIKLDNIP EYYISGISTKIV NCDEALVLNN SGNRTNRLHN ITRNVIKKTCR IGSTLYNSYG 900 

lYNRNKISNN HIGVRLLNNS CFYFDNAPVI NEEDKQTFIS NRTWQLYSSN GTFPLNFHYN 960 

SLQGGDTDTW lYNDTYTNRY IDVSNNHWGN NDLFDPNQVF NTPDLFXWIP FWDGLPNGRS 1020 

GNSSAEAVEF QTALDCIGNS DYLSAKVALK MMVETYPESD PAIAALKELF RIEKMSGNDY 1080 

EGLKDYPRSN PTIISSQNLF PTADPLSARC DIVCENYQSA IDWYENRLNS EISYQDSVFA 1140 

VIDLGDIYWKT MQLDSLRGTG IDLNILSCEQ RKSLESHQNV KNYLLSTLPE STGTLLPPLE 1200 
CNKSSLDKSK IISISPNPAK AWTIIYYXD NPSCSVIKIY GINGASADIT GLPKHLSEGY 1260 
ySIQFNTSNF DPGFYLVTLN VDQKIIDTEK LRIK 1294 
<212> Type : PRT 
<211> Length : 1294 
SequenceName : SEQ ID 340 

SequenceDescription : 

Sequence 



<213> OrganismName : Shigella flexneri 2a str. 2457T 
<400> PreSequenceString : 

MQGKNTIVTT GDYSIGLLSQ TSGNLNTDTI IRVNSDGSVT PSFSDGDDTF IVTAGNHAVG 60 

VLACASPGSA CACVSSLDEE STADTGSNEN NAIAKLDMAK GEITTHGTES YAAYANGTW 120 

KAGDTLDYTN ASVTLTDVDI TTHGDNAHAI AARQGTVSFF QGEXYTTGPD AAIAKIYMGG 18 0 

TVTLKNTSAV AHQGSGIVLE SSINGQEATV DILSGSSLRS ANEILYHKDE TSNVTITDSE 240 

VSSAADVFIN NIKGHLTVDA TNSKITGSAN ISTDDNTHTY LSLSDNSTWD IKADSTVSNI. 3 00 

TVDNSTVYIS RADGRDVEPT RLTITENYVG NNGVLHLRTE LDDDNSATDK WINGNTSGT 3 60 

TRVKVTNAGG SGAYTLNGIE IISVEGESNG EFIKDSRIFA GAYEYSLTRG NTEATNKNWY 42 0 

IiTNFQATSGG ETNSGGSSAP TVAPTPVLRP EAGSYVANLA AANTLFVMRL NDRAGETRYI 480 

DPVTEQERSS RLWLRQIGGH NAWRDSNGQL RTTSHRYVSQ LGGDLLTGGF TDSDSWRLGV 540 

MAGYARDYNIi THSSVSDYRS KGSVRGYSAG LYATWFADDI SKKGAYIDSW AQYSWFKNSV 60 0 

KGDELAYESY SAKGATVSLE AGYGFALNKS FGLEAAKYTW IFQPQAQAIW MGVDHNAHTE 660 

ANGSRIENDA NNNIQTRLGF RTFIRTQEKN SGPHGDDFEP FVEMNWIHNS KDFAVSMKTGV 72 0 

KVEQDGVSNIi GEIKLGVNGN LNPAASVWGN VGVQLGDNGY NDTAVMVGLK YKF 773 




<212> Type : PRT 

<211> Length : 773 

SequenceName : SEQ ID 341 
SequenceDescription : 


Sequence 



<213> OrganismName : Shigella flexneri 2a str. 2457T 
<400> PreSequenceString : 

MTKLKLLALG VLIATSAGVA HABGKFSL6A GVGWEHPYK DYDTDVYPVP VINYEGDNFW 60 
FRGLGGGYYL WNDATDKLSI TAYWSPLYFK AKDSGDHQMR HLDDRKSTMM AGLSYAHFTQ 12 0 

YGYLRTTLAG DTLDMSNGIV WDMAWLYRYT NGGLTVTPGI GVQWNSENQN EYYYGVSRKE 180 
SARSGLRGYM PMDSWSPYLE LSASYUTFLGD WSVYGTARYT RLSDEVTDSP MVDKSWTGLI 240 
STGITYKF 248 

<212> Type : PRT 

<211> Length : 248 

SequenceName : SEQ ID 342 
SequenceDescription : 

Sequence 



<213> OrganismName : Shigella flexneri 2a str. 2457T 

<400> PreSequenceString : 

MKKIALAGLA GMLLVSASVN AMSISGQAGK EYTNIGVGFG TESTGLALSG NWTHNDDDGD 60 
VAGVGLGLNL PLGPLMATVG GKGVYTNPNY GDEGYAAAVG GGLQWKIGNS FRLFGEYYYS 120 
PDSLSSGIQS YEEANAGARY TIMRPVSIBA GYRYLNLSGK DGNRDNAVAD GLYVGVNASF 180 

<212> Type : PRT 
<211> Length : 180 
SequenceName : SEQ ID 343 

SequenceDescription : 

Sequence 



<213> OrganismName : Shigella flexneri 2a str. 2457T 
<400> PreSequenceString : 

MTTLTARVFT TAEIIYRKTV lALVCHLNCS RQETVTMNKT IMALAIMMAS FAANASVLPE 60 

TPVPFKSGTG AIDNDTVYIG LGSAGTAWYK LDTQAKDKKW TALAAFPGGP REQATSAFID 120 

GNLYVFGGIG KNSEGLTQVF NDVHKYNPKT NSWVKLMSHA PMGMAGHVTF VHNGKAYVTG 18 0 

GVNQNIFNGY FEDLNEAGKD STAIDKINAH YFDKKAEDYF FNKFLLSFDP STQQWSYAGE 240 

SPWYGTAGAA WNKGDKTWL INGEAKPGLR TDAVFELDPT GNNLKWNKLD PVSSPDGVAG 300 

GFAGISNDSL IFA6GAGPKG SRENYQNGKN YAHEGLKKSY STDIHLWHNG KWDKSGELSQ 3 60 
GRAYGVSLPW NNSLLIIGGE TAGGKAVTDS VLISVKDNKV TVQN 404 
<212> Type : PRT 
<211> Length : 404 

SequenceName : SEQ ID 344 

SequenceDescription : 

Sequence 



<213> OrganismName : Shigella flexneri 2a str. 2457T 

<400> PreSequenceString : 

MA.TGGAALAG KAVMGAAAGA AGGASALQAA FQKASASMET GGDMSSMGSV VSSGGNGGGE 60 
AGTAGSSPFA QAAGFGDSGS SSSGGGFAKA AKLATGTASE IjAKGVGSQVK QGFQERVSET 120 
TGGKLAASIR ESMEPKEASQ SGQPEGNSLG ADSGPDSNEV RS 162 
<212> Type : PRT 
<211> Length : 162 

SequenceName : SEQ ID 345 
SequenceDescription : 

Sequence 



<213> OrganismName : Shigella flexneri 2a str. 2457T 
<400> PreSequenceString : 

MKRVLIPGVI LCGADVAQAV DDKNMYMYFF EEMTVYAPVP VPVNGNTHYT SESIERLPTG 60 

NGNISDLLRT NPAVRMDSTQ STSLNQGDIR PEKISIHGAS PYQNAYLIDG ISATNNLNPA 12 0 

NESDASSATN ISGMSQGYYL DVSLLDNVTL YDSFVPVEFG RFNGGVIDAK IKRFNADDSK 180 

VKLGYRTTRL DWLTSHIDEN NKSAFNQGSS GSTYFSPDFK KNFYTLSFNQ ELADNFGVTA 240 

GLSRRQSDIT RADYVSNDGI VAGRAQYKNV IDTALSKFTW FASDRFTHDL TLKYTGSSRD 300 

YNTSTFPQSD REMGNKSYGL AWDMDTQLAW AKLRTTVGWD HISDYTRHDH DIWYTELSCT 3 60 

YGDITGRCTR GGLGHISQAV DNYTFKTRLD WQKFAVGDVS HQPYFGAEYI YSDAWTERHN 420 

QSESYVINAA GKKTNHTIYH KGKGSLGIDN YTLYMADHIS WRNVSLMPGV RYDYDNYLSN 480 

HNISPRFMTE WDIFADQTSM ITAGYNRYYG GNILDMGLRD IRNSWTESVS GNKTLTRYQN 54 0 

LKTPYNDELA MGLQQKIDKN VIARASEAHD QISKSSRTDS ATKTTITEYN NDGKTKTHSF 600 

NLSFELAEPL HIRQVDINPQ IVFSYIKSKG NLSLNNGYEE SNTGDNQWY NGmVSYDSV 660 

PVADFlsINPLK ISLNMDFTHQ PSGLVWANTL AWQEARKARI ILGKTNAQYI SEYSDYKQYV 72 0 

DEKLDSSLTW DTRLSWTPQF LKQQNLTISA DILNVLDSKT AVDTTNTGVA TYASGRTPWL 780 

DVSMKF 786 
<212> Type : PRT 
<211> Length : 786 

SequenceName : SEQ ID 346 

SequenceDescription : 



Sequence 



<213> OrganismName : Shigella flexneri 2a str. 2457T 
<400> PreSequenceString : 

MKKTLLAIML AGTAFASQAG TLVSQGTEAS ANLTLTKPIV VNNTIQPVKG VYSGTLTAWT 60 
PLATGIVGAS DGQSHDYAVT FPDDIYAESS TSADAVISGD NNPDHKLKVS LTTLEQDPPS 120 
AASEEIGGKR YMMLKNTGTG GAYRWSHMK EQWEPDSYT IRTQAYIYAE 170 
<212> Type : PRT 
<211> Length : 170 

SequenceName : SEQ ID 347 

SequenceDescription : 

Sequence 



<213> OrganismName : Shigella flexneri 2a str. 2457T 
<400> PreSequenceString : 

MGIYHWSRKT KMKRTKSIRH ASFRKNWSAR HLTPVALAVA TVFMLAGCEK SDETVSLYQN 60 

ADDCSAANPG KSAECTTAYN NALKEAERTA PKYATREDCV AEFGEGQCQQ APAQAGMAPE 12 0 

NQAQAQQSSG SFWMPLMAGY MMGRLMGGGA GFAQQPLFSS KNPASPAYGK YTDATGKNYG 180 

AAQPGRTMTV PKTAMAPKPA TTTTVTRGGP GESVAKQSTM QRSATGTSSR SMG6 234 



<212> Type : PRT 

<211> Length : 234 

SequenceName : SEQ ID 348 
SequenceDescription : 
Sequence 



<213> OrganismName : Shigella flexneri 2a str. 2457T 
<400> PreSequenceString : 
MTKMSRYALI TALAMFLAGC VGQREPAPVE EVKPAPEQPA EPQQPVPTVP SVPTIPQQPG 60 
PIEHEDRTAP PAPHIRHYDW NGAMQPMVSK MLGADGVTAG SVLIiVDSVNN RTNGSLNAAE 120 
ATETLRNALA NNGKFTLVSA QQLSMAKQQL GLSPQDSLGT RSKAIGIARN" VGAHYVLYSC 18 0 

ASGNVNAPTL QMQLMLVQTG EIIWSGKGAV SQQ 213 
<212> Type : PRT 
<211> Length : 213 

SequenceName : SEQ ID 349 

SequenceDescription : 

Sequence 

<213> OrganismName : Shigella flexneri 2a str. 2457T 
<400> PreSequenceString : 

MTKLMQFVQR CYYMTNKPOVIY FILILtVFTLL QVCFFALWKA RDGSTTSLEC TSTLTRNAKT 60 
DHSLYYSANXi SVILKKDGSG SFTIVGLTDE DTPRKFSHSY FFTYKIDSNG RISGNAKAKV 120 
SGLENQIKDE NFRLNFLDAS LTGKGNARLS KFNlSJVYIFSI PGLIINTCAP I 171 



<212> Type : PRT 
<211> Length : 171 

SequenceName : SEQ ID 350 
SequenceDescription : 

Sequence 



<213> OrganismName : Shigella flexneri 2a str. 2457T 
<400> PreSequenceString : 

MGRISSGGMM FKAITTVAAL VIATSAMAQD DLTISSLAKG ETTKAAFNQM VQGHKLPAWV 60 
MKGGTYTPAQ TVTLGDETYQ VMSACKPHDC GSQRIAVMWS EKSNQMTGLF SAIDEKTSQE 120 
KLTWLNVNDA LSXDGKTVLF AALTGSLENH PDGFNFK 157 
<212> Type : PRT 
<211> Length : 157 

SequenceName : SEQ ID 351 
SequenceDescription : 



Sequence 


<213> OrganismName : Streptococcus mutans UA159 
<400> PreSequenceString : 

MKKQFLEKAV FTVAATAATV VLGNKMADAD TYTLQEGDSF FSVAQRYHMD AYELASMNGK 60 

DITSLILPGQ TLTVNGSAAP DNQAAAPTDT TQATTETNDA NANTYPVGQC TWGVKAVATW 120 
AGDWWGNGGD WASSASAQGY TVGNTPAVGS IMCWTDGGYG HVAYVTAVGE DGKVQVLESN 180 

YKDQQWVDNY RGWFDPNNSG TPGSVSYIYP N 211 

<212> Type : PRT 

<211> Length : 211 

SequenceName : SEQ ID 352 
SequenceDescription : 



Sequence 



<213> OrganismName : Streptococcus mutans UA159 

<400> PreSequenceString : 

MSIKNILENK TTTIKVSFAG lATAASLILP MAVQAETTYT VKSGDTLSEI ASTHGTTVDK 60 
LAKLNKINNI HLIHAGQILE LDAATEDTDA TPVQESQINE AETSASAKTS QTSEVTTTAP 120 
VQESQTSEVI TSAPAETSQT SEVPTEANQT NEVSSAVSVE TSQTSEATTS APVETSQTSE 180 
ATTAEPTETK TSQTNEVAAS AEENQTTSNT SGLSTSDAAA KEFIAQKESG GNYNAKNGQY 240 

YGRYQLSDSY LNGDLSEENQ ERVADAYVSS RYGSWTAAQA FWNANGWY 288 
<212> Type : PRT 
<211> Length : 288 

SequenceName : SEQ ID 353 
SequenceDescription : 

Sequence 

<213> OrganismName : Streptococcus mutans UA159 
<400> PreSequenceString : 

MKCQAFEDFK ATSIiNKIiSYT TGGATDGEII ANRMLQGKAT KGEITMYTWN IIQNGWVNSL 60 
VSWGIGGYNS SIGYSAQGNR GFSNYPYDVS MDSDNSSSSS NTTGGYVNYN QSFNSGW 117 


<212> Type : PRT 
<211> Length : 117 

SequenceName : SEQ ID 354 

SequenceDescription : 


Sequence 



<213> OrganismName : Streptococcus mutans UA159 
<400> PreSequenceString : 

MRYSQICRKS LALIiATGMIL TTSTLPSISI LAEDSTGAPA RPDGQAPAGG GANTTTYDYS 60 
GINSGVXiVAN GSKVTSSSKT KSTTSAQNTA LVQNGGSLTL HKANLIKSGD DNNGDISTDNFY 12 0 

GINSILLAVN ERSKAYVSNS KLKASSSGSN" 6IFATDKATI YANKTSIATT ADNSRGLDAT 180 
YNGNIIANKM AISTKGAHSA AIATDRGGGN ISTTNSSLNT SGSGSPLLYS TGNIQWHVT 240 
GTSSNSQIAG MEGLNTILIH NSNBISTMTN KTASDPIANG VIIYQSQSGD AEATTGQSAH 3 00 

FELSKSKLTS SITSGSMFYL TNTSANIILN QSTLNFDANK AKLLTVAGNS ANNWGTPGSN 360 
GATVOTTGHK QTLKGDVDVD SISTLNMYLL DKTNYTGKTA VSTNSTNISP STSPITMNIS 420 
KNSKPJVLTGH STVTNLMAEK GAKIVDKDGK TVSVISSSGQ KIjVKGKSKYS LTVTGTYSQK 480 
VTTSSSNKPS SSYINRSDFD NYFKTTTAFV NNTKNTSN 518 
<212> Type : PRT 

<211> Length : 518 

SequenceName : SEQ ID 355 
SequenceDescription : 

Sequence 


<213> OrganismName : Streptococcus mutans UA159 
<400> PreSequenceString : 

MNBCIGDTLRD ARIEKKLSFD DWDKTGIAP HYILAMELDQ LKLLPEGKTN EYLEKYAHAV 60 
GLDPVSIIHG YRNQEMSDEL ILPSSAELAA SSDSNIEKKN EGKSIEEPQE LAIDSLDVTQ 120 

NITEETPQIE DFKVESEEAS KKIEKIPSRL SKYDYDEEPK KKFPWALILL ILLALTIISY 180 
VGYWYNQLQ TDSNKTELST STKKSKDTKN DANSTTQSQT SITTDFADGG NNITLSNTNG 240 
KVEVTFTLTG DEESWVSATN TTDGESGTTL TATDKTYTVT LAEGSTTSML TVGSPSGVEI 3 00 

TXNGQKVDTT ISILVNAGLTNI NLTVQ 325 
<212> Type : PRT 

<211> Length : 325 

SequenceName : SEQ ID 356 
SequenceDescription : 

Sequence 

<213> OrganismName : Streptococcus mutans UA159 
<400> PreSequenceString : 

MKSRKRQRKG LVRKNEIIIL TLFVASAVSL LAFTNSFGVL AKSLHLEKIN KSITISLPFG 60 
KKKMEQTARY YSGEQVQISS SAKKDSLGKG LSHYQNWIGT VKKIKSQKDS RQKHHYSYEV 120 

TFDNGKALKY VQEKDLVKTK RSKYSKGQIV KLKSSATADL DGSSLTDYRA SAGKIDHISY 180 
NHSNTTGGYK YDITFDEGGK VTNIQEKDLD KVYEVQLKSE NTAAQNNEIL KQAFAYAKQH 240 
SGTILSLPNG EFKIGSQTPD KDYITLTSDT EIRGDNTTLL VEGSAYWFAF ATGTSASDGV 3 00 

KNPTMRNINI KASDLEKGNQ FMIMADHGDN WKICNNSFTM VHKKGSHIFD LGSLQNSAFE 360 
GNQFTGYAPE LTNVSKIDDN ADLHDFYSEV IQLDAAESSG VWDGGLIKAI DPNYENYNKE 420 

KQLCNNITIA NNSFVPYIDS HGKIIAYSGT IGQHSSDVGL VKIYDNVFSN SLVSRFNQNG 480 
KSEAWIFKAI HLKSNYNNAV YANSIS 506 
<212> Type : PRT 
<211> Length : 506 

SequenceName : SEQ ID 357 

SequenceDescription : 

Sequence 



<213> OrganismName : Streptococcus mutans UA159 
<400> PreSequenceString : 

MRKLKVALFA SSILGMLAVS SYTAADTEDN QVTISHYNEQ AGTFDVNAVQ AANGKTIQSI 60 
DVAIWSEENG QDDLKWYHAS NDGSNQLTVH FNAENHGSKV GSYIAHAYIT YTDGNRVGVN 120 
LGKRKLSIjSA PQLSLKQGGL QLFSKLKPSA ADQLFSAVWS DENGQDDLHW YTADADGNTL 180 

AGYANHICGYG TYHVHTYLKQ NGKMIPISAQ DIDIPKPKVK IQIDKINDTS YDVWNNVPP 240 

YISSVAIPVW SEQNGQDDLK WYQATKVADG IFKTTVYLKT HRFEIiGNYQA HIYGDSQLSK 3 00 

KLDGLGETHF NVPSIINYED PQVTIDHYNI NKGTFDVTVA ETDNSKAIQS ISAAVWSDAN 3 60 

QANLYWYTSAK QLANGKAAIT VDVQKHGNQT GSYNVHVYVH YlsTDGTTSGHV IiANQQLNQIV 420 

HYQPSAVRIT AYMNBKNTYP VGQCTWGVKE LAPWIPNWLG NGGQWASTVA VKGFKIGTVP 480 

KVGAIACWSD GGYGHVAYVT HVES13NRIQV KEANYKNQQY ISNFRGWFDP TTSYLGRLTY 540 

lYPD 544 
<212> Type : PRT 
<211> Length : 544 

SequenceName : SEQ ID 358 

SequenceDescription : 



Sequence 



<213> OrganismName : Streptococcus mutans UA159 
<400> PreSequenceString : 

MANNYSR.RQQ PTKKTKGTSR KRPTEHIKTG FSALQKSVAI lAGILGIITA LITINNYRNS 60 
SHNDKKDSTS KTTIIKEKEV DDSNSNl!3NAA NSQAENDSNN NNNSAESNQN QTATTANDSIST 12 0 

SNSANQUQAN SQSQANNQQN QNNANAGQ 148 
<212> Type : PRT 
<211> Length : 148 

SequenceName : SEQ ID 359 

SequenceDescription : 



Sequence 



<213> OrganismName : Streptococcus mutans UA159 
<400> PreSequenceString : 

MKIFSFGtTIR NNTALKPNYD DTTAFSGFGT IRNNTALKQS TNCASWFNRF GTIRNNTALK 60 
LTIIilNGVSF CFGTIRNNTA LKPRGPXFVS TFRNRAIHLS QISASK 106 
<212> Type : PRT 
<211> Length : 106 

SequenceName : SEQ ID 360 

SequenceDescription : 



Sequence 



<213> OrganismName : Streptococcus mutans UA159 
<400> PreSequenceString : 

MKRKRNIiYFI, IGI»FLTVFLIi IGCSMQKKTK SESSSTSQKT TLQTKQSSEK STDAKQTTEA 60 
HSESSQSSSH SNNEETX.API DTGAVIiKADY SSMAGTWKNE EGQTLTFDQR GLTTPGMTVS 120 
LLNIDQDGNIi LLNVETGTKK NLTIiYIVPAN KTLSNQYFSN GQSDESDKTK DRIVSSESLN 180 
SGKFTNR.VYY HVSTH 195 
<212> Type : PRT 
<211> Length : 195 

SequenceName : SEQ ID 361 

SequenceDescription : 

Sequence 



<213> OrganismName : Streptococcus mutans UA159 
<400> PreSequenceString : 

MTPKKIKIAL TALISLMLAL FLFLFNHHSV RENSQQEKLK ISKASSKKSQ TSTSSVMTSS 60 
RKATEQXSQA QTQSQSQAEQ SNPNVILPIP QELVGTYKGS SPQASEITFT ISSNGQLRAQ 120 
ANFDPASDIN DVTATVSGVR KVGADTYIWE FVSGSSAALL PGVTGIGGLG KMQPGFILKG 180 
GQLTPIMFTG SVDGEIDYSH PNPYPVSIiNK Q 211 
<212> Type : PRT 
<211> Length : 211 

SequenceName : SEQ ID 362 

SequenceDescription : 



Sequence 

<213> OrganismName : Streptococcus mutans UA159 
<400> PreSequenceString : 

MKKIIWIVL SLSVPFLIAC SNSSTGEKTS QSSEETKVRL IVKTDSNKTD EKVAFKKGAT 60 
VMDVLKDNYK VKESGGFITT IDGVTQDKKA GRYWMFDVND KLASKAADKI KVKKTGDKIEF 120 
YLK^YKGKN" 129 
<212> Type : PRT 
<211> Length : 129 
SequenceName : SEQ ID 363 

SequenceDescription : 

Sequence 



<213> OrganismName : Streptococcus mutans UA159 
<400> PreSequenceString : 

MSNKPWEEKV TDATTDNEEM TRNSKDASII STPILTILLS LFFLIIIGIL FPVLYTSMGG 60 
SNEKAATSGF YSSSKTVKKA KNEANSQTDE QTTEAETSSS ETTSSSSDSD GETITVQGGE 120 
GAAAIAARAG ISVDKLYELN PEHMTHGYWY ANPGDNIKIK 160 
<212> Type : PRT 

<211> Length : 160 

SequenceName : SEQ ID 364 

SequenceDescription : 

Sequence 



<213> OrganismName : Streptococcus mutans UA159 
<400> PreSequenceString : 

MPDNRMNYSI DSNMQFPLVE ITLETGEFAY IQRGSMVYHT PSVTLNTKVN GRGSGLGKLV 60 

GAIGRSVTSG ESFFITQAVS NASDGKLALA PSMP6QVIAL ELGEKQYRLN DGAFLALDGS 120 

AQYQMKAQSV GRALFGGQGG LFVMTTEGQG TLLANSFGSI KKIELQNQEI TIDNAHWAW 180 

SRDLNYDIHL ENGFMQSIGT GEGWNTFR6 TGEIYVQSLN LQQFAGVLQG FITNTNR 237 



<212> Type : PRT 
<211> Length : 237 

SequenceName : SEQ ID 365 
SequenceDescription : 

Sequence 


<213> OrganismName : Streptococcus mutans UA159 
<400> PreSequenceString : 

MKKNYFWYGL LGLLALYLIT lAFIPGFHIF FSNMLMLALF FMLXALSNRS IFFFFLALGF 60 

LSIYLKDXFH PDYSTGPLFT GIIXIGVILN SFLKPHYSYS YKGNHYFNMK QHANYIDNET 120 
DVFLKTLFSB NTSYVTSQEL NKIIIDTKFG EQSVDLSQAQ FMTDSPEIHI DVSFGETNLR 180 

IPNNWKIINK THSPFASXSF SGFPSTNCTP INVTLTGTVA MGSLNIQY 228 

<212> Type : PRT 

<211> Length : 228 

SequenceName : SEQ ID 366 
SequenceDescription : 

Sequence 



<213> OrganismName : Streptococcus pneumoniae R6 
<400> PreSequenceString : 

MKSITKKIKA TLAGVAALFA VFAPSFVSAQ ESSTYTVKEG DTLSEIAETH NTTVEKLAEN 60 
NHIDNIHLIY VDQELVIDGP VAPVATPAPA TYAAPAAQDE TVSAPVAETP WSETWSTV 120 
SGSEAEAKEW lAQKESGGSY TATNGRYIGR YGSWTAAKNF WLNNGWY 167 
<212> Type : PRT 
<211> Length : 167 

SequenceName : SEQ ID 367 
SequenceDescription : 

Sequence 


<213> OrganismName : Streptococcus pneumoniae R6 
<400> PreSequenceString : 

MKHSHKKSFD WYSMQQRYSI RKYYFGAASV LLGTALVLGA AASVQTVQAE ENKQETTNSI 60 
SVGRGEAATK PAEVSASNKE KTYAAPTVAN PVETTPVKTE EVTKPAEKVE EAKDKKEEVT 120 
HQDAVDKSKL LTALSRAKKL ESKLYTEASA ANLQTSIQAG QSLLGKADAT EAELSAAESS 180 

IQSFIIGLEL RSNSNKETVS ETPVAKKADA VESKEGAKPA ATTERSAVDS AILPTSTADK 240 
VETTSAPASI NEILKLGLSL SDARQNPAIR KEDVNRGYSG FRAASNPANP IVSGSGNTVA 300 
FADISQGGRS YSFRGYGNSR GGNSIHYDVT TVRSGNSVNF TISYSAPGDS REFWNNFIL 360 

DKGDGB'GNPS NATITSSNPR VREQSKSISQ GANYVSHSGY SMTSAISTNT EQTIRFSLPI 420 

INLNGDLSVR LKPVTFNVDQ GGGGAATSND PYSNSNYYYR ANPLYLDANP YGGTNNKTVS 480 

EDIDFQTVYr. PTSKLPEGQT RLVREGEKGQ RQITYKVHRF GNETLLGLiPI SNSVTKEAKP 540 

RIMQIGVAKD LIDTVKPRVD QNKVGDTNNL TFYLDNDGNG VYTEGVDELV QKIAIKDGAK 600 

GEKGDQGERG LTGAKGEKGD RGERGLTGAQ GAKGEKGDRG ERGLTGAQGA KGEKGDRGER 660 

GLTGAQGAKG EKGDRGERGL TGAQGAKGEK GAQGERGLTG AQGAKGEKGD QGERGLTGAQ 72 0 

GAKGEKGDQG ERGLTGAQGA KGEKGAQGER GLTGTQGAKG EKGDRGERGL TGAQGAKGEK 780 

GDRGERGLTG AQGAKGEKGA QGERGLTGAQ GAKGEKGDQG ERGLTGAQGE KGDRGERGLT 840 

GAKGEKGDQG ERGITGAKGE KGAQGERGLT GAQGAKGEKG DQGERGLTGA QGEKGAQGQA 900 

GRDGVTPTVT VKDNKNDGTH TITINDGRGN VTSTWRDGF DGASPLVATQ RNDADKTTTV 9 60 

IFYYDKWGNW ELDASDKKLK EWIADGAKG EKGDKGEQGL QGRDGEQGPK GEDGKTPTVK 1020 

VTDGQDGTHT ITINDGKGGI TTTWRDGFD GASPLVSTHR NEADKTTTVJ FYYDLNDNNQ 1080 

FDEGDTKLKIE WIADGKQGP KGDKGD^TGKD GFTPEVTVTD NNNGTHTITI I'QPDNRPS-LT 1140 

TIVKNGEDGIC TPKVKAERDD AKKQTTLTFY IDKDGDGSYT AGKDELVQTT WKDGQDGAA 1200 

GASGRDGKEV LNGKVDPTTE GKDGDTFWT QTGDVFVKKG NTWEPAGNIK GPKGDKGADG 12 60 

AKGEKGAQGE RGLTGAQGVK GEKGDQGERG LTGSKGEKGD QGERGLTGAQ GAKGDKGEQG 1320 

LQGRDGAQGP KGADGQRGPA GPQGPKGEQG NPGTPGKDGK SLIAVKNGVL VTITPVEGRP 13 80 

QTTFVEDGQEC GADGKTPTVT ITEGQNGTHT LTVHNPGSPD VTTTIRDGAT GQAGRDGKDV 1440 

LNGKVNPQPN: QGKNGDKYIN IETGDVYVKN NGNT-JDKEGNI KGPKGDKGAD GAKGEKGDQG 1500 

ERGLTGAQGi^S. KGADGAVGRD GRDGKDVLNG KANPEAHQGK DGDKYVNTET GDVFVKNNGN 1560 

WDKEGNIKGP KGDKGADGAK GEKGDRGERG LTGAQGAKGA DGAAGRDGRD GRDGKDVLNG 1620 

KVNPEANQGK DGDKYVNTET GDVFVKNNGN WDKEGNIKGS KGDKGERGED GKTPEVTVTP 1680 

GKDGHSTDIT FTVPGKDPVT VNVKDGENGL NGKTPKVDLL RVQGKNGNPS HTIVTFYTDE 1740 

NNDGKYTPGX DELLGSEMIK DGAKGADGRD GKSLLTVKDG KETKVYQEDP ANPGQPLNPE 1800 

KPLAVIRDGV DGKSPTVTAV RKDEAGHKGV EXTVDNHDGS QPTTVFVQDG AKGKTGATGQ 1860 

DGQTPTITTQ RGQDGQSTW TITTSGKDPV TFTVKDGKNG KDGRAPKIKV EDITSPSRIR 192 0 

RDTDAAATPT RNGIRVTVYD DVNDNGVYDE GVDKVLNSKD lYNGIDGRDG SAPTITTKDN 1980 

GDGTHTITVQ NPDGSESTTV VKDGKDGKTA NITTTENPDG SHTITVTNPD GSTKETWKN 2040 

GKDGKTPKVE VTDNNDGTHT VKVTDGDGNV TNAIIKDGKD GKAATATTTE NPDGSHTVTI 2100 

TNPDGTKNEF WKNGRDGVD GRTPTASVRD NGDGSHTIVI TNPEGVTTET TVRDGKSPKV 2160 

TITDEQNGTEI KISVLNGDGT TTETXIKDGK SPVATVRDNQ DGTYTIRVEN GNGTVSETTV 2220 

RDGKSPTABCV VDNGDGTHTI TWNSDGITT TTTVRDGREP KLEVIDNNDG SHTIK/^TGAD 2280 

GKGTTTTIFD GKSPKANIVD NGDGTHTLTI VDSDGREYKS IIKDGKDGKD SVSPTVTVKN 2340 

NNDGTHWTX TWPDGSKTEM VIKDGKDGKS PKVSVEDNGD GSHTITIINS DGTVTKTVIK 2400 

DGKDGRDGRO GRDGKDGKDG KCGCQDKPVT PSNDKPVPPT PNVPTPEVPV KPVPAQPTPN 2460 

VPTPEVPVQP TPAVSTPEVP VKPVPAVPEQ PWPTPAQPA TPVNANPVAP TTGKENRGDK 2520 

LPETGSQSDY ISVLLGSGIL LSLYVGIiRKE D 2551 
<212> Type : PRT 

<211> Length : 2551 

SequenceName : SEQ ID 368 
SequenceDescription : 

Sequence 

<213> OrganismName : Streptococcus pneumoniae R6 
<400> PreSequenceString : 

MKKRMLLASX VALSFAPVLA TQAEEVLWTA RSVEQIQNDL TKTDNKTSYT VQYGDTLSTI 60 

AEALGVDVTV LANLNKITNM DLIFPETVLT TTVNEAEEVT EVEIQTPQAD SSEEVTTATA 12 0 

DLTTNQVTVD DQTVQVADLS QPIAEAPKEV ASSSEVTKTV lASEEVAPST GTSVPEEQTA 180 

ETSSAVAEEA. PQETTPAEKQ ETQTSPQAAS AVEATTTSSE AKEVASSNGA TAAVSTYQPE 240 

ETKIISTTYE APAAPDYAGL AVAKSENAGL QPQTAAFKEE lANLFGITSF SGYRPGDSGD 3 00 

HGKGLAIDFM VPERSELGDK lAEYAIQNMA SRGISYIIWK QRFYAPFDSK YGPANTWNPM 3 60 

PDRGSVTENH YDHVHVSMNG 380 

<212> Type : PRT 

<211> Length : 380 

SequenceName : SEQ ID 369 
SequenceDescription : 

Sequence 



<213> OrganismName : Streptococcus pneumoniae R6 
<400> PreSequenceString : 

MTILGKDTVQ QSAKGESVTQ EATPEYKLEN TPGGDKGGNT GSSDANANEG GGSQAGGSAH 60 

TGSQNSAQSQ ASKQLATBKE SAKNAIEKAA KNKQDEIKGA PLSDKEKAEL LARVEAEKQA 120 

ALKEIENAKT MEDVKEAETI GVQAIAMVTV PKRPVAPNAA PKTTSAPQAT AGTMQDVTYQ 180 

SPAGKQLPNT GSASSAAIjAS LGLWATSGF ALLGRKTRRR K 221 
<212> Type : PRT 

<211> Length : 221 

SequenceName : SEQ ID 370 
SequenceDescription : 


Sequence 



<213> OrganismName : Streptococcus pneumoniae R6 

<400> PreSequenceString : 
MMTTGCSMGA YHALNFFLQH PDVFTKVIAL SGVYDARFFV GDYYNDDAIY QNSPVDYIWN 60 

QNDGWFIDRY RQAEIVLCTG LGAWEQDGLP SPYKLKEAFD KKQIPAWFAE WGHDVAHDWE 120 

WWRKQMPYFL GNLYL 135 

<212> Type : PRT 

<211> Length : 135 
SequenceName : SEQ ID 371 

SequenceDescription : 

Sequence 



<213> OrganismName : Streptococcus pneumoniae R6 
<400> PreSequenceString : 

MNKGLFEKRC KYSIRKFSLG VASVMIGATF FGTSPVLADS VQSGSTANLP ADLATALATA 60 
KENDGHDFEA PKVGEDQGSP EVTDGPKTEE ELLALEKEKP AEEKPKEDKP AAAKPETPKT 120 
VTPEWQTVEK KEQQGTVTIR EEKGVRYNQL SSTAQNDNAG KPALFEKKGL TVDANGNATV 180 
DLTFKDDSEK GKSRFGVFLK FKDTKNNVFV GYDKDGWFWE YKSPTTSTWY RGSRVAAPET 240 
GSTNRLSITL KSDGQLNASN NDVNLFDTVT LPAAVNDHLK NEKKILLKAG SYDDERTWS 300 
VKTDNQEGVK TEDTPAEKET GPEVDDSKVT YDTIQSKVLK AVIDQAFPRV KEYSLNGHTL 360 
PGQVQQFNQV FINNHRITPE VTYKKINETT AEYLMKLRDD AHLINAEMTV RLQWDNQLH 420 
FDVTKIVNKN QVTPGQKIDD ERKLLSSISF LGNALVSVSS DQTGAKFDGA TMSNNTHVSG 480 
DDHIDVTNPM KDLAKGYMYG FVSTDKLAAG WSNSQNSYG GGSNDWTRLT AYKETVGNAN 540 
YVGIHSSEWQ WEKAYKGIVF PEYTKELPSA KWITEDANA DKKVDWQDGA lAYRSIMNNP 600 
QGWKKVKDIT AYRIAMNFGS QAQNPFLMTL DGIKKINLHT DGLGQGVLLK GYGSEGHDSG 660 
HLNYADIGKR XGGVBDFKTL lEKAKKYGAH LGIHVNASET YPESKYFNEK ILRKNPDGSY 720 
SYGWNWLDQG INIDAAYDLA HGRLARWEDL KKKLGDGLDF lYVDVWGNGQ SGDNGAWATH 780 
VLAKEINKQG WRFAIEWGHG GEYDSTFHHW AADLTYGGYT NKGINSAITR FIRNHQKDAW 840 
VGDYRSYGGA ANYPLLGGYS MKDFEGWQGR SDYNGYVTNL FAHDVMTKYF QHFTVSKWEN 900 
GTPVTMTDKTG STYiCWTPEMR VELVDADNNK VWTRKSNDV NSPQYRERTV TLNGRVIQDG 960 
SAYLTPWNWD ANGKKLSTDK EKMYYFNTQA GATTWTLPSD WAKSKVYLYK LTDQGKTEEQ 1020 
ELTVKDGKXT LDLLANQPYV LYRSKQTNPE MSWSEGMHIY DQGFNSGTLK HWTISGDASK 1080 
ABIVKSQGAU DMLRIQGNKE KVSLTQKLTG LKPNTKYAVY VGVDNRSNAK ASITVNTGEK 1140 
EVTTYTNKSIi ALNYVKAYAH NTRRNNATVD DTSYFQNMYA FFTTGSDVSM VTLTLSREAG 1200 
DEATYFDEXR TFENNSSMYG DKHDTGKGTF KQDFENVAQG IFPFWGGVE GVEDNRTHLS 1260 
EKHDPYTQRG WNGKKVDDVI EGNWSLKTNG LVSRRNLVYQ TIPQNFRFEA GKTYRVTFEY 1320 
EAGSDNTYAF WGKGEFQSG RRGTQASNLE MHELPNTWTD SKKAKKATFL VTGAETGDTW 1380 
VGIYSTGNAS NTRGDSGGNA NFRGYNDFMM DNLQIEEITL TGKMLTENAL KNYLPTVAMT 1440 
NYTKESMDAX. KEAVFNLSQA DDDISVEEAR AEIAKIEALK MALVQKKTAL VADDFASLTA 1500 
PAQAQEGLAN AFDGNLSSLW HTSWGGGDVG KPATMVLKEA TEITGLRYVP RGSGSNGNLR 1560 
DVKLWTDES GKEHTFTATD WPDNNKPKDI DFGKTIKAKK IVLTGTKTYG DGGDKYQSAA 1620 
ELIFTRPQVA ETPLDLSGYE AALAKAQKLT DKDNQEEVAS VQASMKYATD NHLLTERMVE 1680 
YFADYLNQLK DSATKPDAPT VEKPEFKLSS VASDQGKTPD YKQEIARPET PEQIIiPATGE 1740 
SQFDTALFLA SVSLALSALF WKTKKD 1767 
<212> Type : PRT 
<211> Length : 1767 

SequenceName : SEQ ID 372 

SequenceDescription : 

Sequence 



<213> OrganismName : Streptococcus pneumoniae R6 

<400> PreSequenceString : 

MKLYNKSELR YSRIFFDKRP PAFAFILIIS TAIILSGALV GAAYIPKNYI VKANGNSVIT 60 

GTEFLSAISS GKWTLHKSE GDMVNAGDVI ISLSSGQEGL QASSLNKQLV KLRAKEAIFQ 120 

KFEQSLNEKY NRMSNSGEEQ EYYGKVEYYL SQLNSENYNN GTQYSKIQDE YTKLNKITAE 180 

RNQLDADLQT LQNELIQLQQ QGDSPSLSDT TSADDKAKLE TKILEITTKI EALKTNITSK 240 

NSEIDSQQSN IKDMNRTYND PTSQAYNIYA QLVSELGTAR SNNNKSITEL EANLGVATGQ 300 

DKAHSILAPIT EGTLHYLVPL KQGMSIQQGQ TIAEVSGKEK GYYVEAFVLA SDISRVSKGA 360 

KVDVAITGVlSr SQKYGTLKGQ VRQIDSGTIS QBTKE6NISL YKVMIELETL TLKHGSETW 420 

LQKDMPVEVR IVYDKETYLD WILEMLSPKQ 450 
<212> Type : PRT 
<211> Length : 450 

SequenceName : SEQ ID 373 
SequenceDescription : 

Sequence 



<213> OrganismName : Neisseria meningitidis Z2491 

<400> PreSequenceString : 

MNKGLHRIXF SKKHSTMVAV AETANSQGKG KQAGSSVSVS LKTSGDLCGK LKTTLKTLVC 60 

SLVSLSMVLP AHAQITTDKS APKNQQWIL KTNTGAPLVN IQTPNGRGLS HNRYTQFDVD 12 0 

NKGAVLNNDR NNNPFLVKGS AQLILNEVRG TASKLNGIVT VGGQKADVII ANPNGITVNG 18 0 

GGFKNVGRGI LTIGAPQIGK DGALTGFDVR QGTLTVGAAG WNDKGGADYT GVXiARAVALQ 240 

GKLQGKNLAV STGPQKVDYA SGEISAGTAA GTKPTIALDT AALGGMYADS ITLIANEKGV 300 

GVKNAGTLKA AKQLXVTSSG RIENSGRIAT TADGTEASPT YLSIETTEKG AAGTFISNGG 3 60 

RIESKGLLVI ETGEDISBRN GAWQNNGSR PATTVLNAGH NLVIESKTNV NNAKGSANLS 420 

AGGRTTINDA TIQAGSSVYS STKGDTELGE NTRIIAENVT VLSNGSIGSA AVIEAKDTAH 480 

IESGKPLSIjE TSTVASNIRL NNGNIKGGKQ LALIiADDNIT AKTTNLNTPG NLYVHTGKDL 540 

NLNVDKDLSA ASIHLKSDNA AHITGTSKTL TASKDMGVEA GLLNVTNTNL RTMSGNLHIQ 600 

AAKGNIQIiRN TKLNAAKALE TTALQGNIVS DGLHAVSADG HVSLLANGNA DFTGHNTLTA 660 

KADVNAGSVG KGRLKADNTN ITSSSGDITL VAGNGIQLGD GKQRNSINGK HISIKNNGGN 720 

ADLKNLNVHA KSGALNIHSD RAIiSIENTKL ESTHNTHLNA QHERVTLNQV DAYAHRHLSI 78 0 

TGSQIWQNDK LPSANKLVAN GVLALNARYS QIADNTTLRA GAXNLTAGTA LVKRGNINWS 840 

TVSTKTLEDN AELKPIiAGRL NIEAGSGTLT lEPANRISAH TDLSIKTGGK LLLSAKGGNA 900 

GAPSAQVSSL EAKGNIRIiVT GETDLRGSKI TAGKNLWAT TKGKLNIEAV NNSFSNYFPT 960 

QKAAELNQKS KELEQQIAQL KKSSPKSKLI PTLQEERDRL AFYXQAINKE VKGKKPKGKE 1020 

YLQAKLSAQN IDLISAQGIE ISGSDITASK KLNLHAAGVL PKAADSEAAA ILIDGITDQY 10 80 

EIGKPTYKSH YDKAALNKPS RLTGRTGVSI HAAAABDDAR IIIGASEIKA PSGSIDXKAH 1140 

SDXVLEAGQN DAYTFLKTKG KSGKIIRKTK FTSTRDHLIM PAPVELTANG ITLQAGGNIE 1200 

ANTTRFNAPA GKVTLVAGEE LQLIiAEEGIH KHELDVQKSR RFIGIKVGKS NYSKNELNET 12 60 

KLPVRWAQT AATRSGWDTV LEGTEFKTTIi AGADIQAGVG EKARVDAKII LKGXVNRIQS 132 0 

EEKLETNSTV WQKQAGRGST XETLKLPSFE SPTPPKLSAP GGYXVDIPKG NIiKTEXEKLS 13 80 

KQPEYAYKKQ LQVAKNINWN QVQLAYDRWD YKQEGLTEAG AAXXALAVTV VTSGAGTGAV 1440 

LGLNGAAAAA TDAAFASLAS QASVSFINNK GDVGKTLKEL GRSSTVKNIiV VAAATAGVAD 1500 

KXGASALimV SDKQWXNNIiT VNItANAGSAA LINTAINGGS IiKDNLGDAAL GAIVSTVHGE 1560 

VASKIKFMXiS EDYXTHKXAH AXAGCAAAAA NKGKCQDGAI GAAVGEIVGE ALTNGKNPAT 1620 

LTAKEREQXXi AYSKLVAGTV SGWGGDVKT AANAAKVAXE NNLIiSQEEYA LREKLIKKAK 1680 

GKGLLSLDWG SXiTEQEARQF XYBXEKDRYS NQI»LDRYQKN PSSLNNQEKN XLAYFINQTS 1740 

GGNTAWAASX LKTPQSMGNL TXPSKDXKTMT LSKAYQTIiSR YDSFDYKSAV AAQPALYLLN 1800 

GPLGFSVKAA TVAAGGYNXG QGAKAISNGE YLHGTVQWN GTLMVAGSVS AQAAXSAKPA 18 60 

PVTRYLSNDS APALRQALTA ESQRXRMKLP EEYRQIGNLA lAKIDVKGLP QRMEAFSSFQ 1920 

KGEHGFISLP ETKIFKPXSV DKYHNIASPP RGTLRNIDGE YKLLETIAQQ LGNNRNVSGR 1980 

IDLFTELKA.C QSCSNVILEF RNRYPNIQLN IFTGK 2015 

<212> Type : PRT 

<211> Length : 2015 

SequenceName : SEQ ID 374 
SequenceDescription : 

Sequence 



<213> OrganismName : Neisseria meningitidis Z2491 
<400> PreSequenceString : 

MDLIQTPNKQ FVDGDRRTPG TPVPAWt'JIiNQ LQGELYSILN AVGXEPNKAD HAQVLSAIKT 60 

LAADASQVAS IDALRKYSGT GYVNVNAYHA NTTVGGGVFV ADKADKSTAD NGCTVIVSTD 120 

GTRWKRVFSG MLNLHDFGYV ASKNNALSTL NAAESAALDV WDCLGLSID TGNIYPQKNK 18 0 

YTNGKFVING KTVDVQYQPI RSGIGRFISG TGAAANLKSN EWTGAGLIVI GEGAMEQMEK 240 

CVSSIAI6DR AQGFSKVSRD NIAIGADSLI NVQAATEWYD QSRMEGTRNI GIGGNAGRGI 3 00 

TSGYSNVSIG RNAGQGLGEG SSNIALGAGA MAGTAPVGFS GDIEVFWPSS TSRTIAIGEA 360 

VLQTYQGRAA QTAIGANAAR NTKKAEKVTA IGSAAMENLE RNRAPNGGDV WTGTEAGTY 420 

AQSGKNITIiT FPNIRGAQAT YWVGIRLTSG TAQTLQNDW PAQWSVNGN TLXXQSSKEL 4 80 

TATGAAEIiKY VYSVNSTATK NEELTIIGAN AMNKALTAGY STIIGVDAAIj LGDNYQKTTA 540 

IGASSLRTGS HISTTAIGYW VIPLASSEKC VAIGDSAGYR NVQGDFLTGK ITNSIAIGYG 60 0 

ARINGDNEIQ IGTTGQTLYA PTAVNIRSDG RDKADVKPLT NGLDFVMKLK PMTGYYDRRD 660 

SYVDELFKX)L PADERADKVR EWWANPIKDG SHKEDRLRHW FIAQDIAALE DBYGRLPMVN 720 

KTNDTYTVEY ETFIPVLTKA IQEMAARIET liBTEMKESKK 760 
<212> Type : PRT 
<211> Length : 760 

SequenceName : SEQ ID 375 
SequenceDescription : 

Sequence 

<213> OrganismName : Streptococcus pyogenes MGAS8232 
<400> PreSequenceString : 

MKKISRKCFM TSWCIILGG ILLGAGYATG GLQDIKHQTA PKKVIKTFDQ ITALDIDSSA 60 

STITVETGPV QRPTVTYYTH PKFIDPIVTT LTGKTIiSLSQ KPKDIVITGG lEILGFTLNN 120 

SRQEKNYRSI TITVPEKTSL NEVKGSNVPH TTLSNLTVQD MQFDGNLTLL HTKVKKATIT 18 0 

GMLEATKSQL TNLELKADYS FSISTLTDSSVE NGTISLGNGQ LTTKDTTLKA INIQSLHPGG 24 0 

lEAERTTLEN VTFTVSKSKE EEEENDYYDN DAIFTAHALT LKGTNTISGG DIDVDITLTK 3 00 

AKAIAYRART ENGKVSLGSQ LTE>AKIGKES TSDVXSYVAE NKAATGNLTV NLNKGDITIK 360 

<212> Type : PRT 
<211> Length : 360 

SequenceName : SEQ ID 376 

SequenceDescription : 

Sequence 



<213> OrganismName : Streptococcus pyogenes MGAS8232 
<400> PreSequenceString : 

MFKKENLKQR YFNFGLVALA LTIIiAIIFAF SSKNADTKSY AKKSESKMVT IDKAPKNNHA 60 
ITKEESKEKA KSIASEPIPT VEJSTSVAPTVT EEAPWQQEV TQTVQQVSSV AYNPNNWLS 120 
NGNTAGIVGS QAAAQMAAAT GVPQSTWEHI lARESNCSCfPN" AANASGASGL FQTMPGWGST 18 0 

ATVEDQVNAA LKAYSAQGLS AWGY 204 
<212> Type : PRT 
<211> Length : 204 

SequenceName : SEQ ID 377 

SequenceDescription : 

Sequence 



<213> OrganismName : Streptococcus pyogenes MGAS8232 
<400> PreSequenceString : 
MLEELKTLIK NPKLMITMIG VAXiVPALYNL SFLGSMWDPY GRVISTDLPIAV WHDKPAKRA 60 
DKSIiTIGNDM VDKMSKSKDL DYHFVSSKSA QKGLKKGDYY MVXTLPEDLS QRATTLLNPE 120 
PQKLiTIRYQT SKGHGMVAAK MCTITAMAKLK ESVSQNXTKT YTSAVFSSMT DLQSGIiKEAS 180 
TGSQALDSGA KTAQMGSQML SDl^JLAGLSSA SWQFQQGTNR LTSGLTAYTA GVSQVKDGLG 240 
QLSTDMPVYL NGVSRLSQGA SQIiNQGLSQL TQSTTIiSDDK AKRIQSIiEVG LPVLNQGIQQ 300 
LNENLSTMQV PKLNTDELGN NIiAAIAQAAQ QLLVKEAAAH KEQLAVLQAT SAYQSLTAEQ 360 
QGELTAALTQ TDKGEAVAPA QTXLRSVQTL STSLQSLSQE DQSKQLEQLK EAVAQIANQS 420 
NQALPGASSA LTELSTGLAK VWGSLNQQVL PGSNQLTTGL AQLNRYNTAI GSGVIKLSEG 480 
ANALSSKSGE LLDGSHQLSE GATKLADGSS QLSQGGHQLT SGLTELSTGL SILNGSLAKA 540 
SQQLSLVSVT DKNAKAVAKP LVXiNEKDKDG VKTNGIGMAP YMIAVSLMW ALSTNVIFAN 600 
SLSGRPVKDK WDWAKQKFVI NGFISTMGSI VLYLAIQIiLG FEARYGMETL GFIMLSGWTF 660 
MALVTALVGW DDRYGSFASL VMLLLQVGSS GGSYPIELSG AFFQKLHPFL PMTYWSGLR 720 
QTISLSGHIG VEVKVLTGFL LAFMVLSLLI YRPKKTV 757 
<212> Type : PRT 
<211> Length : 757 

SequenceName : SEQ ID 378 

SequenceDescription : 

Sequence 



<213> OrganismName : Streptococcus pyogenes MGAS8232 
<400> PreSequenceString : 

MSRDPTYTIN EHDLSFADGR FYVTFKADKS SETVRLNSSC LGNTIIKKLQ VEDDNTMHDF 60 

VKPKVTTQQA FGLAQQVKEL DLQLKDPKSD LWGKIKFKNK AMLVEYANKE MSSAIAQSAE 120 

QILLQVKSID DERYSKFEQT LMGIKQTVKS ESVESARTQL ASMFDSRISG LDGKYSRLSQ 180 

TIDSLSSRLD DGVGNYSTLS QKVSGIDLRV SNAANDVSRL SQTAQGLQSQ ITNANQNYSS 240 

LSQTVQGLQT TVRDNQSNAT SRINQLSDIiI STKVSKGDVE TTIAQSYDKI AFAIRDKLPA 300 

SKMSGSEIIS AINLDRSGVK ITOKNITLDG NSYISNAVIK DAHIANMDAG KINTGYLNAN 360 

RIATEAITGE KIKMDYAFFN KLTAUEGYPR TLFAKDIFAT SVQSVTLSAS KITGGVLAAT 420 

NGASQWDLNN ANMTFNiyDAT IKTFNSKNNAL VRKDGTHTAF VHFSNATPKG YRGSALYASI 480 
GITSSGDGID SASSGRFAGL RSFRYATGYN HTAAVDQTEL YGDNVLIADD FSINRGFKFR 540 

PDKMEKVLDM NDLYAAWAL GRCWGHIiANV GWNTAHSNFT SAVSREIiNNY ITKI 594 

<212> Type : PRT 
<211> Length : 594 

SequenceName : SEQ ID 379 
SequenceDescription : 

Sequence 

<213> OrganismName : Streptococcus pyogenes MGAS8232 

<400> PreSequenceString : 

MAADGKVTIL VDVDGKQVKV LNSELDPCVAK HGDKGSSSLK KFAVGAGVFK LASAAVDLVS 60 

QSLGKAITRF. DTLEKYPRVM KAMGHSAEDV ARSTDKT'^VNG IDGLPTTLDE WGTAQRIiTS 12 0 

ITKDINKSTN LTLALNNAFIi ASGASSEAAS RGLEQYAQML SAGKVDMQAW KTLQETMPYA 180 

LQQTAEAFGF AGASAQKDFY EALKNGQITF DQFSNKLIEIi NDGVGGFAEL AKENSKGXET 240 

SFNNIKNAIA KGVANSIKAL DDLSKAATGK GIADHFDSLK WINASFSAI NASIKASTPL 3 00 

FKLLFSVIGA GISWKAIaSP ALVGVASGLA AMRAVNETIT MIKAIiNRAWV MASASMSIGA 3 60 

TTIKTVTAVQ AVSTTMTKAD MVARLSQLGV LKASWIYGV MTGAISLSTA ATIASTAAVT 42 0 

ALKAALVALT GPVGWWGAI GALVAVGVSL WSWLTKESDE TKKLKKEQEG LVESNKQLRD 480 

SVREGVQERK KGLESVKEST AAHQKQADEI IKLAAKENKT AGEKQNLKNK IDQLNGSIDG 54 0 

LNLAYDKNSN SLSHNADQIK SRISAMEAES TWQTAQQNLL, NIEQKRSEVS KKLAENADLR 60 0 

KKWNEEANVS DSVRKEKIAE LTEEEAKLKN MQTQLQEEYN KTSATQQAAA DAMAAAEESG 660 

SARQVIAYEN MSEAQRTAID NMRTKYSELL ETTTSIFDAI EQKTALSVDQ MNTNLEKNRA 720 

25 ATEQWATNLE ILAQRGVDQG ILEQLRRMGP EGATQTQVFV DATDAELAPL QENFRAATET 78 0 

AKN-AMGSVLD SAGVEMPEKV KGMVTNVSTG LQAELQAANF AQLGQEIPNG VSQGISQGAG 840 

KASDASVKMG QEVKRSFQGE LGIHSPSRVF TEYGGHITDG LSNGVTNGTS KVMQTMQSIiA 90 0 

QQMSQKGQQI VNDMRSKSNQ ITDAFSTMSG PMHSHGVNAM QGLANGIYAG SGAALAAAQS 9 60 

lAARITATIQ SALDIHSPSR VMRDEVGRFI PQGIAVGIDA DRKVIDSSMQ KLKESMTINA 1020 

30 TPEIASGFGG GVAGIANQTT NNSNNSFTLN VKVDESDGNS HEKYQRLFRE FSWYIQQQQG 1080 

RLGDVK 1086 
<212> Type : PRT
<211> Length : 1086

SequenceName : SEQ ID 380

SequenceDescription :

Sequence

<213> OrganismName : Streptococcus pyogenes MGAS8232
<400> PreSequenceString :

MAKEPWEEKI VDDTXGTRTR KSRNAFISTP WLTALLSVFF VIIVAILFIF FYTSNSGS15R 60 
QAETN6FYGA STHKKTRKAS NAKKTSSSST TTDTTPSSEE TLASSEGTGE TLTVLAGEGA 120 
ASIAARAGIS VEQLQALNPE HMTQGYWYAN PGDQVTIK 158 
<212> Type : PRT
<211> Length : 158

SequenceName : SEQ ID 381
SequenceDescription :

Sequence

<213> OrganismName : Streptococcus pyogenes MGAS8232
<400> PreSequenceString :

MSKRGKIKIT TKTKLITASV ITLVLIITGV VLWKQQQNTL TADIAKEPYS TVSVTEGSIA 60
SSTLLSGTVK ALSEEYIYFD ANKGNDATVT VKIGDQVTQG QQLVQYNTTT AQSAYDTAVR 120
SLNKIGRQIN HLKTYGVPAV STETNKDEAT GEETTTTVQP SAQQNANYKQ QLQDLNDAYA 180
DAQAEVNKAQ lALNDTWIS SVSGTWEVN NDIDPSSKNS QTLVHVATEG QLQVKGTLTE 240
YDLAIWKVGQ SVKIKSKVYS NQEWTGKISY VSNYPTESNA GSTTPAGSTG AGSSTGAAYD 300
YKIDIISPLNT QLKQGFTVSV EWNEAKQAL VPLTAVIKKD KKHYVWTYDD ATGKAKECVEV 360
TLGNADAQQQ EIHKGVAVGD IVIANPDKNI KPDKKLEGVI SIGTNTKPEK DSQSKNKKSG 420
VDK 423

<212> Type : PRT
<211> Length : 423

SequenceName : SEQ ID 382
SequenceDescription :

Sequence
<213> OrganismName : Treponema pallidum
<400> PreSequenceString : 

MLRLPTARAC ITMGTMIRHT FTHRCGALLC ALALGSSTMA ATAAAKPiCKG QMQKLRQRPV 60 
WAPTGGRYAS LDGAPTALAN DASFFEANPA GSAJSnyiTHGEIi AFFHTTGFGS FHAETLSYVG 120 
QSGNWGYGAS MRMFFPESGF DFSTTTEPVC TPASNPIKQR GAIGIINFAR RIGGLSLGAN 180
LKAGFRDAQG LQHTSVSSDI GLQWVGNVAK SFTSEEPNLY XGLAATNLGL TVKVSDKIEN 240
CTSTCEKCGC CKERCCCNGK KACCKDCDCN CPCQDCNDKG TVHATDTMLR AGFAYRPFSW 300

FLFSLGATTS MNVQTLASSD AKSLYQNLAY SIC3AMFDPFS FLSLSSSFRI NHKANMRVGV 360 
GAEARIARIK LNAGYRCDVS DISSGSGCTG AKASHYLSLG GAILLGRIT 408 
<212> Type : PRT

<211> Length : 408

SequenceName : SEQ ID 383

SequenceDescription :

Sequence

<213> OrganismName : Treponema pallidum
<400> PreSequenceString :

MSRTFRAWQC VGALCALSPL LPAYSSEGVR EVPPSQSPQV WAYEPIRPG DQLLKIGIVA 60 
GCQLYIAGGN GTNGSSSSGT NGNGNGKLLG GGGFHLGYEY FFTKNFSIiGG QVSFECYRTT 120

GSNYYFSVPI TVNPTYTFAV GRWRIPLSLG VGLNIQSYLS KKAPGLIAEA SAGIiYYQYTP 180 

DWSIGGIVAY TQLGDIASSP DKCRAVGIiAT IDKOVRYHF 219 

<212> Type : PRT

<211> Length : 219
SequenceName : SEQ ID 384

SequenceDescription :

Sequence

<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString :
atgataaatt taagtaagga agcaacggtg gggaaagcat taacccctat tgctatactt 60
atgatgttgt cttttcctgt agcttctcaa gcggcgggat tagtcataaa aaatggaacg 120
gtatataacg ccaatggtgt gccagtcgtt gacatcaaca aacctaacgg tagcggttta 180
tctcataata tctgggataa cctaaacgtt gataaaaatg gtgtcgtttt caataatagc 240
gctaatgaat ccagtacttc acttgccgga aatattcagg gaaacagtaa tctgacctcc 300
gggtcggcga aggtgatcct gaatgaggtt acttccaaaa atccttcaac cattaatggg 360
atgatggaag ttgcagggga taaagcggat ctgattattg ccaacccgaa tggtattact 420
gtaaacggtg gcggttcaat caatacaggt aaacttacct taaccaccgg gacgccggat 480
atccaggatg acaagctggc cggttactcc gtgaacggcg gtaccattac gctcggtaaa 540
ctggataacg ccagcccgac agaaattctg tcccgtaacg tggtagttaa cggcaaagtg 600
tctgccgatg agctgaacgt tgttgctggc aataactatg ttaatgccgc aggccaggtg 660
accggtagcg tatccgccac ggggtcccgt aacggttaca gcgtagatgt tgccaaactg 720
ggcggaatgt atgcgaacaa aatcagtctg gtcagcaccg agaaaggtgt gggggttcgc 780
aacctcggcg ttattgctgg gggtgttaat ggtgtcagca tcgattccaa aggtaacctg 840
ttaaacagta acgcccagat tcagtctgca agcacgatca acctgacaac aaatggtact 900
ctggataaca ccaccggtac ggtgacatct gtaggcacta tctcgcttaa taccaacaag 960
aatactatcg tgaatacccg tgcgggtaac atctctacga tgggcgatat ctacgttaac 1020
agcggtacga ttgacaatac taacggcaag cttgcggctg caggaatgct ggcggttgat 1080
accaataacg ccacgctgat taactctggt aaagggagtt ctgtcgggat tgaagcgggg 1140
ctcgtggcgc tgaaaaccgg aacgctcaac aacagcaatg gtcagattcg cggtggctat 1200
gtgggtcttg aatccgctgc gctgaataac aacaacggtg atatccagac caccggcgat 1260
atcgccatta tcagtaacgg taatgtggat aacaacaaag gtctgatccg ttcgtccacc 1320
gggcatatcg ttattggcgc ggcaggtagc gtaaataatg gttcaaccaa aaccgccgat 1380
accggcagtt ctgactctct gggcattatt gcagataccg gcgtagaaat tggtgcgaac 1440
aacatcaata acaacggcgg acagattgcg tctaatggca acgtctccct gtcaagttac 1500
agcacgatcg acgactatgc gggcaaaatt ctgtccaaca gcaaagtgat tatcaaggga 1560
agctctctgc gtaacgatac cggggggatc agcggtaagc agggtattga agtcgccgtt 1620
ggcggcagcc tgaccaataa tattggcgtg atcagctctg aagagggtga tatctccctg 1680
ttagccaact ccgtggataa ccacggcggc ttcatgatgg ggcagaacat cacgatggag 1740
tcgatgtctg gcgtcaataa caacacagcg ctgatcgtgg ccagcaaaaa actgaagata 1800
aatgcgcgcg gcagtatcga aaaccgcgat ggcaataact tcggtaatgc ttatggtctg 1860
tacttcggca tgcctcagca aacgggtgga atggtcggca aggaaggcat cgagctttcc 1920
gggcagaaca tctataacaa caacagccgt cttatcgctg aggatggtcc tctgactctg 1980
caggcgcaga acacgttcga caacacgcgt gctctggtca ccagcggggc ggatgcatct 2040
attcaggttg gcggaacgta ttacaacaac tacgctacca cctggagtgc gggcaacctg 2100
gatatcgacg cgaccacgct gcaaaacagc agcagcggta cgatgatcga taacaatgcg 2160
accgggttca tagcatctga taaaaacctg tcactggaag tggtgaatag ccttaccaac 2220
tacggctgga tcagcggtaa aggcgatgtt gatgtcacgg tgaataacgg caacctgtat 2280
aaccgcaata ccattgcggc tgaaaagggg ctggatattg ccgcgttgaa cggtattgaa 2340
aactggaagg atatttctgc tggcggcgac ctgacgatga acaccaatcg ccatgtgacc 2400
aacaactcca acagcaatat ggtggggcag aatattgtta ttaacgcggt taacgatatc 2460
aacaaccgtg gcaacattgt cagtgacgct gacctgaacg tgacgaccaa aggcaacctg 2520
tataactatc tctatatggt agggtatggg gatatcgcat tgtcggcaaa tagcgtggcg 2580
aacaataacg cgaccatcga agcgacaggc gatctgatta tcgattcgaa gggtaacgtg 2640
ggtaacaacc gcggtaatct gcatgcgttg aacggcgtgt tgtctgttaa aggcaacaat 2700
ctgaacaacg ataacggtga aattcgtggt tatggcgatg tcacgctggc actgacgggc 2760
aactacgaca gctataaggg ttcgctgacc tctgaaacgg gcgacgtgac tctgacggcg 2820
aacattgtag acaacgccta tggtttgatt gccggtgaga atgtttctgt cgatgctaaa 2880
tcgacgattt acaacaacac tgcgqtgatc gcggcgaat?. aaaagctggt tgttaacgct 2940
ggcggcaacc tcgaaaaccg cgacgggaat aacttcctgc gtaataacgg cgcgctgttt 3000
ggaattaccg acaacgttgg cggcatcgta ggtaaagaag gtgtcacgct ttctgctcag 3060
aacgtctaca acaataacag cagcatcatc gctgaaaatg gtccgcttaa tctgctgtcc 3120
aggggaacgc tggataatac ccgcgcgctt cttagcagtg gggctgatgc catcatccgt 3180
gcggcaggga cgttctacaa caactatgcc accacgtaca gcgccggtaa tctcgacgtt 3240
tatgcggcgt cgttgaacaa cgccagcgat ggtcgcctgg aagacaatac cgccacgggc 3300
gtgattgcgt ctgacaaaaa cctggatctg agcgttgata acagtgtcac taactatggt 3360
tggatcagcg gtaaaggaga tgtgcatttc aatgttctga aaggcacgct gtataaccgt 3420
aatgccatcg cggcggacaa cgcgctgacc attaatgccc tgaacggtgt tgagaacttt 3480
aaagacattg tggcgggtac tgcgctgact attgatacgc agaagtatgt taccaacaac 3540
agcaacagta atatgttggg acaaaccatc gcgatcaatg ccgtgaatga cattaataac 3600
cgtggaaata ttgtgggtga ttattctctg ggtgttaaaa ccaccggtaa tatttataac 3660
tacctcaata tgctgagtta tggtgtcgct ggcgtatcgg caaataaggt tacgaatagc 3720
ggtaaagacg ctgttctcgg tggcttctac ggtttagcgt tagaagcaaa cgaaactgat 3780
aacaccggta ctattgtcgg catgtaa 3807

<212> Type : DNA

<211> Length : 3807

SequenceName : SEQ ID 385
SequenceDescription :

Sequence

<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString :

gtgaacacaa tacacttgcg ctgtctcttc aggatgaatc ccctggtctg gtgcctgtgg 60
gctgatgttg cagcaaagct aaggtcgctt aaacgctact cagtattcac ttttcagagg 120
atgaaattta tgaacaggac cagtccctat tattgtcgtc gctcagtact ttccttattg 180
atatctgcct tgatatatgc cccgcccggg atggctgcct tcactcctga tgttattggt 240
gtggtaaacg atgagactgt agatggcagc caacgagtag atgaacgagg tacaacaaat 300
aacactcata ttatcaacca tggccagcag aatgtttatg gcggggtatc taatggaagt 360
cttattgaat ctggtggata tcaagatgta ggaaggcata acaattatgt ggggcagtct 420
aataatacca ccattaacgg gggcagacag tcaattcatg acgggggtat ttccacaggt 480
acgataatcg agagtggcaa tcaggacgtt tataaagggg gtatcagcaa tggaacgaca 540
attaagggcg gtgcttcacg cgtagaggga gggagtgcga atggaacact cattgatggt 600
ggtagccaga tagtaaaagt tcaagggcat gctgatggta caacgataaa taagtctggc 660
tctcaggacg tagtacaagg aagtctggca acgaacacaa ccataaatgg tggtcgacag 720
tatgttgaac agagcacagt agaaacaacc accatcaaaa atggcggtga gcaaagagta 780
tatgagagcc gtgcgctgga cacgacgatt gaaggcggaa ctcagtctct gaatagtaag 840
tcaacggcaa aaaatactca gatctattct ggtggtacgc aaattattga taacaccagc 900
tcctcggatg ttattgaagt ttattccggt ggcgtgcttg atgttagtgg tggtacggca 960
acaaatgtta cccagcacga tggtgcaatt ttaaaaacta acactaacgg tacgacggtg 1020
agcggtacga atagtgaagg tgcattctcc atccacaatc acgtggcaga caatgtgttg 1080
ctggaaaacg gtggtcattt agacataaac gcatatggtt cggcaaacaa gacgattatt 1140
aaagataaag gaacaatgtc agttttaacc aatgctaaag ctgatgcgac ccgaatagat 1200
aatggcgggg ttatggatgt tgcaggaaac gcgacaaata ccataattaa tggtggcaca 1260
cagaatatta ataattatgg catagccaca ggcaccaata tcaacagcgg aacgcaaaat 1320
atcaaaagcg gcgggaaagc tgacacaaca attatatcct ccgggagccg gcaggttgtt 1380
gagaaagatg gtacggcaat tggcagcaat attagcgccg gaggctcgct gattgtctat 1440
accggcggta ttgcacatgg ggttaaccag gagacgggca gtgctttagt tgccaacacg 1500
ggtgcaggga ctgatatcga aggatacaac aagctctctc acttcactat taccggaggg 1560
gaggctaatt atgttgtgct ggaaaatacc ggcgaactga cggtagtggc taaaacctcg 1620
gcgaaaaata ctaccattga tgctggcggt aagctgattg tccagaagga ggctaaaaca 1680
gatagcacca gacttaataa tggcggcgtt ctggaggttc aggacggtgg tgaggctaag 1740
catgttgagc aacaatccgg cggcgcatta attgcttcca cgacctccgg aacacttatc 1800
gaaggaacca acagttatgg tgatgctttc tacatcagga attcagaagc taaaaatgta 1860
gtgctggaaa acgctggctc attaacagtc gtcactggtt cccgggcagt tgacacgatt 1920
attaatgcca acggcaaaat ggatgtttat ggaaaagatg ttggcactgt actcaatagt 1980
gctggcaccc aaacaatata tgccagtgcc acttctgata aagcaaatat caaaggtggc 2040
aagcaaacgg tatatggttt agccactgaa gcaaatatcg aaagtggtga acaaattgtt 2100
gatggtgggt caacagagaa aacacacatc aatggtggca cgcaaaccgt tcagaattat 2160
ggtaaggcga tcaataccga tatcgtctct ggcctacaac aaattatggc aaacgggaca 2220
gcggaaggtt ccattattaa tggcggttca cagatagtta atgagggcgg tctggctgaa 2280
aactcggtgc ttaatgatgg cggcacactc gatgtgcggg agaaaggcag cgcaacgggg 2340
atacagcaga gtagdcaggg cgcgttggtt gcaaccacca gggcgacgcg ggtcacagga 2400
acacgcgcgg atggcgtcgc gttcagcatc gagcagggtg cggcgaacaa tatcctgctg 2460
gcaaatggcg gagtgttaac cgtggagtca gacacctctt ctgacaaaac acaggtcaat 2520
acgggcggac gggagatcgt caaaacaaaa gccactgcga cr,ggcacgac gctcaccggc 2580
ggtgaacaaa ttgtcgaggg tgtggcgaat gagacaacaa ttaacgacgg cggaatacaa 2640
acagtttcag ctaacggaga ggcaataaaa acaacgatca atgaaggcgg tacgctgaca 2700
gtcaacgata atggcaaagc gacagatatc gtccagaaca gcggtgccgc tctccagacg 2760
agcacggcta acggtattga aatcagcggt actcaccagt acggcacttt ttccatttcc 2820
ggcaatttag cgaccaatat gttgctggaa aatggcggta atttattggt attagcaggt 2880
accgaagctc gcgactccac ggttggcaag gggggggcaa tgcaaaacca gggtcaggac 2940
tccgccacaa aggttaactc tggtgggcaa tatacccttg ggcggtcaaa agatgagttt 3000
caggctctgg cccgggcaga agate tccag gttgctggcg ggacagcaat cgtctacgca 3060
ggtacgctgg cggatgcatc ggtcagtggc gcgacaggaa gcctgtcgtt aatgacgcca 3120
cgggataatg ttacgccagt taaactcgaa ggggcgatcc ggattaccga tagcgcgaca 3180
ttaactatcg gcaatggcgt tgatacgacg cttgccgacc tgacggctgc cagccggggc 3240
agtgtctggc ttaacagcaa taattcctgt gcaggcacca gcaactgcga gtatagagta 3300
aacagtttgc tacttaacga cggtaatgtt tatttatcag cacaaacagc agcgcctgcc 3360
acaactaacg gtatatacaa tacgctgaca accaatgaac tttccggtag cggtaatttc 3420
tacctgcata ccaacgttgc aggctctcgg ggcgatcaac tggtcgtcaa caacaacgcc 3480
actggtaatt ttaaaatctt tgttcaggat accggcgtca gtcctcagtc tgacgacgcg 3540
atgacgctgg tgaaaacagg gggaggggat gcttcgtttt cgctgggcaa tactggcggt 3600
ttcgttgatc ttgggaccta tgagtatgtc ctgaaaagcg atggcaacag caactggaac 3660
ctgaccaatg atgtcaaacc caacccggat cccaacccaa atcccaaccc aaatccgaag 3720
ccggatccaa aaccagaccc aaaaccggat ccgaaaccag acccgactcc cgagccaacg 3780
ccgacacccg ttccggagaa acgcatcacg ccttctaacg cagccgtact caatatggca 3840
gcaacattac cgttggtatt tgatgctgag ctaaacagta ttcgcgagcg gttgaacata 3900
atgaaagcga gtccacacaa caataatgtc tggggggcga cgtataacac ccgtaataat 3960
gtcaccaccg atgcgggggc cgggtttgag cagacgctga ccggaatgac agtggggatc 4020
gacagcccta atgatattcc tgaggggatt gcgacgctgg gcgcttttat gggttattcc 4080
cattcacata tcggttttga tcgcggagga catggcagtg tgggcagtta ttctctgggc 4140
ggctatgcca gttgggaaca tgaaagtggt ttctatctgg acggtgtcgt gaagctgaac 4200
cgttttgaaa gtaacgtagc cggtaaaatg agcagcggtg gagccgccaa tggcagttac 4260
cacagcaacg ggctgggcgg tcacattgaa accgggatgc gatttaccga tggtaactgg 4320
aacctgacgc cgtatgcatc gttaacgggg ttcaccgctg ataaccccga atatcattta 4380
tccaatggca tggaatcgaa atcagtcgat acccgcagta tatatcgtga actgggcgca 4440
acgctgagtt acaacatgcg tctggggaac ggtatggaaa ttgagccgtg gctgaaggcg 4500
gctgtgcgca aagaatttgt cgatgataac cgggtgaagg tgaataatga cggtaatttc 4560
gtcaatgatt tgtcgggcag acgtggaata taccaggcag gtattaaagc ctcattcagc 4620
agtacgttaa gcgggcatct tggggtgggg tatagccatg gtgccggtgt ggaatccccg 4680
tggaacgcgg tagctggtgt gaactggtcg ttctga 4716
<212> Type : DNA
<211> Length : 4716

SequenceName : SEQ ID 386
SequenceDescription :

Sequence



<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString :
atggcagtaa agatttcagg tgtactgaaa gacggcacag gaaaaccggt agagaactgc 60
accattcaac tgaaagccag acgtaacagc gccacggtgg tggtgaacac ggtggcctct 120
gaaaatccgg atgaagccgg tcgttacagc atggacgttg agtacggtca gtacagcgtt 180
attctgttgg tggaagggtt cccgccgtca catgccggga ccatcaccgt gtatgaagat 240
tctcaacccg gtacgctgaa tgattttctt ggtgccatga ctgaggatga tgtccgtccg 300
gaggcactgc gccgttttga gctgatggtg gaagaggtgg cgcgtaacgc gtccgcggtg 360
gcacagaaca cggcagccgc gaagaagtca gccagcgatg ccagcacatc agcccgtgag 420
gcggcaaccc atgcgactga tgctgcggac tcagcacgcg cagccagcac gtcagccgga 480
caggccgcgt cgtcggctca gtcagcgtct tccagcgcag gaacggcatc aacaaaggct 540
actgaagcat caaaaagtgc tgccgctgca gagtcctcaa aaagcgcggc ggctaccagt 600
gccggtgcgg cgaaaacgtc agaaacgaat gcggcagtgt cacaacaatc agccgccact 660
tctgcatcca ccgcgaccac gaaagcgtca gaagctgcct cctcagccag ggatgcgtcg 720
gcttcaaaag aggcggcaaa atcatcagaa acgagcgcag cctcgagcgc cagtagtgca 780
gcctcctcgg caacggcggc aggcaattcc gcgaaggcgg ccaaaacgtc tgagacaaac 840
gctaagtcct ctgaaacggc agcagaacag agtgcctccg cagcagcagg ctcaaaaaca 900
gcggctgcat tatctgccag tgccgcgtca acaagtgccg ggcaggcctc agccagtgcc 960
accgccgccg gaaaatcggc agaaagtgcc gcatcgtctg cttcaacagc cacaacgaag 1020
gctggcgaag ccactgaaca ggccagcgca gcagcgagtt ctgcttccgc agcgaagaca 1080
tccgaaacga acgcgaaagc gtcggaaacc agcgcagaat cctGaaaaac ggctgccgca 1140
tcgtcagcca gttcggcggc gtcatcggca tcatctgcgt ctgcttcaaa agatgaggcg 1200
accagacaag cgtcagcagc gaagagcagc gccacgacgg catccacgaa ggcgacagag 1260
gcagctggta gtgcgacggc agcagctcag agcaaaagta cggcggaatc tgcagcaacg 1320
cgcgctgaga cagcggcaaa acgggcagag gatattgcat ccgccgtggc gcttgaggat 1380
gcgagcacga cgaaaaaggg gatagtacag ctcagcagtg cgactaacag cacttccgag 1440
tcactggcgg caacgccaaa agccgttaag gccgcgtatg agctggctaa cgggaaatac 1500
accgcacagg atgcaacgac agcacagaaa gggatagttc agcttagcaa cgcgaccaac 1560
agcacatctg aaatgctggc ggcaacgcca aagtcggtaa aggcagccta tgaccttgct 1620
aacgggaaat atactgctca ggacgctacg acagcacaaa aaggaattgt ccagctcagt 1680
agtgcaacca acagcgcatc tgaaacgctt gccgcgacac cgaaagcagt gaaagcagct 1740
aatgataatg cgaatggtcg ggtaccttct gcccgtaagg tgaatggtaa ggcgctttca 1800
tcggatataa cacbgacgcc gaaagatatt ggtacgctta actcaacaac aatgtcattc 1860
agcggtggtg ctggttggtt caaattagca acggtaacca tgccacaggc gagttctgtt 1920
gtttcaatta cgttgattgg tggcgcggga tttaacgtgg ggtcacctca acaggcaggt 1980
atatctgaac ttgttttgcg tgcaggtaat ggtaatccga aggggattac tggtgcttta 2040
tggcagcgca catcgacagg gtttacaaat tttgcctggg tcaatacatc tggtgatact 2100
tacgatattt acgttgcaat cggaaattat gcgactggtg taaatattca atgggattat 2160
accagtaatg ccagcgtgac gattcatacg tcaccagcat attctgctaa taagccggaa 2220
gggttaacgg acggtacagt ttattcactc tatacgccat cagagcagtt ttatccgcct 2280
ggcgcaccaa tcccgtggcc atcagatacc gttccgtctg gctatgccct gatgcagggg 2340
cagacttttg acaaatctgc atacccgaaa cttgcagccg cttatcrcgtc aggcgtgatc 2400
cctgatatgc gtggctggac gattaagggc aaacctgcaa gtggtcgggc cgtattgtct 2460
caggaacagg acggcattaa atcgcacacc cacagcgcca gcgcatccag tacggatttg 2520
gggacgaaaa ccacatcgtc gtttgattac ggcactaaat ccacgaataa caccggggcg 2580
cacacgcaca gtgtgagcgg tacagccgca agtgccggaa accatactca tagtgtcaca 2640
ggcgcatcag cagtcagcca gtggtcacaa aatgggtcag tacataaggt agtgtctgcg 2700
gccagtgtga atacaagtgc tgcaggagcg cacactcata gtgtcagcgg cacagctgca 2760
tctgcaggtg ctcacgcaca tactgtcggt attggtgctc atacgcactc tgttgcgatt 2820
ggctcacatg gacacaccat caccgttaac gctgcgggta acgcggaaaa cactgtcaaa 2880
aacatcgcat ttaactacat tgtgaggctt gcataa 2916
<212> Type : DNA
<211> Length : 2916

SequenceName : SEQ ID 387

SequenceDescription :

Sequence

<213> OrganismName : Escherichia coli O157:H7

<400> PreSequenceString :

atgaaacgag ttattaccct gtttgctgta ctgctgatgg gctggtcggt aaatgcctgg 60
tcattcgcct gtaaaaccgc caatggtacc gctatcccta ttggcggtgg cagcgctaat 120
gtttatgtaa accttgcgcc tgccgtgaat gtggggcaaa acctggtcgt agatctttcg 180
acgcaaatct tttgccataa cgattatccg gaaaccatta cagactatgt cacactgcaa 240
cgaggctcgg cttatggcgg cgtgttatct aatttttccg ggaccgtaaa atatagtggc 300
agtagctatc catttccgac caccagcgaa acgccgcggg ttgtttataa ttcgagaacg 360
gataagccgt ggccggtggc gctttatttg acgcctgtga gcagtgcggg cggggtggcg 420
attaaagctg gctcattaat tgccgtgctt attttgcgac agaccaaaaa ctataacagc 480
gatgatttcc agtttgtgtg gaatatttac gccaataatg atgtggtagt gcctactggc 540
ggctgcgatg tttctgctcg tgatgtcacc gttactctgc cggactaccc tggttcagtg 600
ccaattcctc ttaccgttta ttgtgcgaaa agccaaaacc tggggtatta cctctccggc 660
acaaccgcag atgcgggcaa ctcgattttc accaataccg cgtcgttttc accagcgcag 720
ggcgtcggcg tacagttgac gcgcaacggt acgattattc cagcgaataa cacggtatcg 780
ttaggagcag taggaacttc ggcggtaagt ctgggattaa cggcaaatta cgcacgtacc 840
ggcgggcagg tgactgcagg gaatgtgcaa tcgattattg gcgtgacttt tgtttatcaa 900
taa 903
<212> Type : DNA
<211> Length : 903

SequenceName : SEQ ID 388
SequenceDescription :

Sequence



<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString :
atgggctacg ttacaggtgg attaccaatg aagaataacc gtgcgtgggc gcttatcagt 60
ggtctgatat tgttcagcgg aacggcccca gctgccgata acctgcattt taccggtaat 120
ttgcttggta aatcctgtac tcctgtaatc aatggcaact tacttgcaga aattcatttc 180
cccacaattg ctgccagcga tttaatgcaa cgtggtcagt cagatcgcgt accgttagtt 240
tttcagttga aagattgcaa aagcaccacg gcgtttaatg tcaaggtgac cttgatggga 300
acagaagata ccgacttacc aggatttpctg tcgattgatt cgtcatcttc tgcaacgggt 360
gttgggattg gcattgaaac tgccggaggg gcggctgtac ctattaacag taccacaggt 420
gcctcatttc cattaaatca gggaaataac agtgtcaatt ttaatgcctg gttacagacc 480
gtaaatggac gaaatgttac atcgggtgat ttcaccgcca caatgacggt aacttttgag 540
tatttttaa 549
<212> Type : DNA
<211> Length : 549

SequenceName : SEQ ID 389
SequenceDescription :

Sequence

<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString :
atgaaacgac atctgaacac cagctacagg ctggtatgga atcacattac gggcaccctg 60
gtggtggcct ccgaactggc ccgctcacgg ggaaaacgcg ccggtgtggc ggttgcgctg 120
tctcttgctg ctgtcacatc agtcccggca ctggctgctg acaaggttgt acaggcggga 180
gaaaccgtga acgatggaac actgacaaat catgacaacc agattgtctt cggtacggcc 240
aacggaatga ccatcagtac cgggctggaa ctggggccgg acagtgaaga aaacaccggt 300
gggcaatgga tacagaatgg cgggatagcc ggaaacacca ctgtcaccac aaatggtcgt 360
caggtcgtgc tggagggggg aacagccagt gatacggtta ttcgtgacgg cgggggacag 420
agcctgaacg gactggcggt gaacaccaca ctgaataaca gaggcgagca gtgggtgcat 480
gagggcgggg ttgccaccgg tacaattatc aaccgcgacg gttaccagag cgttaaaagt 540
ggcgggctgg caacaggaac catcatcaac accggcgcag aaggcggccc tgattctgac 600
aactcgtata cgggtcagaa ggtccaggga acagcagaat ccaccaccat caacaaaaat 660
ggacggcaga ttatcttatt ttccgggcta gcccgtgaca ctctcattta cgcaggtggt 720
gaccagtcgg tacacggaag ggccctgaat accacactga atggcggtta ccaatatgtg 780
cacagggacg gacttgcgct gaacacggta attaacgagg ggggctggca ggttgttaag 840
gcaggtggcg ctgccggtaa caccaccata aatcagaacg gtgaactgag ggtacatgcc 900
ggcggggaag ccactgcagt cacccagaac acgggcggtg cactggttac cagtactgct 960
gcaactgtca tcggcacaaa ccgtctgggg aatttcacgg tggaaaacgg taaggctgac 1020
ggtgttgttc tggaatccgg cggtcgtctg gatgtactgg agagccattc agcacagaat 1080
accctagtgg atgacggcgg taccctggca gtgtctgccg gcggtaaggc gacaagtgtc 1140
accataacat ccggtggtgc cctgattgca gacagtggtg ccactgttga ggggaccaat 1200
gccagcggta agttcagtat tgatggcaca tccggtcagg ccagcggcct gctgctggaa 1260
aatggcggca gctttacggt taatgccggg ggacaggctg gcaacaccac tgtcggacat 1320
cgtggaacac tgacgctggc tgccggggga agtctgagtcf gcagaacaca gctcagtaaa 1380
ggcgccagta tggtactgaa tggtgatgtg gtcagtaccg gcgatattgt taacgcaggg 1440
gagattcgct ttgataatca gacgacaccg aatgccgcgc tgagccgtgc tgttgcaaaa 1500
agtaactccc cggtaacgtt ccataaactg accaccacga acctcaccgg ccagggcggc 1560
accatcaata tgcgtgttcg ccttgatggc agcaatgcct ctgaccagct ggtgattaat 1620
ggtggtcagg caaccggcaa aacctggctt gcgtttacaa atgtcggaaa cagcaacctc 1680
ggggtggcaa ccaccggaca gggtatccgg gttgtggatg cacagaatgg cgccaccaca 1740
gaagaaggtg cgtttgccct gagtcgcccg cttcaggccg gcgcctttaa ctacaccctg 1800
aaccgtgaca gcgatgaaga ctggtacctg cgcagtgaaa atgcttatcg tgctgaagtc 1860
cccctgtata catccatgtt gacacaggca atggactatg accggattct ggcaggctcc 1920
cgcagccatc agaccggtgt aaacggtgaa aataacagcg tccgtctcag cattcagggc 1980
ggtcatctcg gtcacgataa caacggcggt attgcccgtg gagccacgcc ggaaagcagc 2040
ggcagctatg gcttcgtccg tctggagggt gacctgctca gaacagaggt tgccggtatg 2100
tctctgacga caggggtgta tggtgctgca ggccattctt ccgttgatgt taaggatgat 2160
gacggttccc gcgccggcac ggtccgggat gatgccggca gtctgggcgg atacctgaat 2220
ctggtacaca catcctccgg cctgtgggct gacattgtgg cccagggaac ccgtcacagc 2280
atgaaagcgt catcggacaa taacgacttc cgcgcccggg gcfeggggctg gctgggctca 2340
ctggaaaccg gtctgccctt cagtatcact gacaatctga tgctggagcc acaactgcag 2400
tacacctggc agggactctc cctggatgac ggccaggata acgccggtta tgtgaagttc 2460
gggcatggca gtgcacaaca tgtgcgtgcc ggtttccgtc tgggcagcca caacgatatg 2520
acctttggtg aaggcacctc atcccgtgac accctgcgcg acagtgcaaa acacagtgtg 2580
agtgaactgc cggtgaactg gtgggtacag ccttctgtta tccgcacctt cagctcccgg 2640
ggtgacatga gcatggggac agccgcagcc ggcagtaaca tgacgttctc accgtcccgg 2700
aatggcacgt cactggacct gcaggccgga ctggaagccc gtatccggga aaatatcacc 2760
ctgggcgttc aggccggtta tgcccacagc gtcagcggca gcagcgctga aggrcfcataac 2820
ggtcaggcta cgctgaatat gactttctga 2850

<212> Type : DNA
<211> Length : 2850

SequenceName : SEQ ID 390
SequenceDescription :

Sequence

<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString :
atgaaaaaat ggcattatat attttgcata attctctttc atttagggtt accgtgcggg 60
tatgcggcaa atgatggcac gtgtgcaaca agaggcggca cacatacatt aagccttaat 120
tttcctctga caacggtcag tgcagcaaac aatgtgcctg gaaatacatt aatagatatt 180
gctaatgcaa catcttctga aaattatagc gttctgtgta actgtgattc aaaacatagc 240
aatggcgctt atcacgaaat atattatacc gcagaccctg ctcccggtat ggtttatagc 300
accaccgcaa gtggtcttgc tttttactat cttaacgaat atgtcgatgt gggaacaaaa 360
atatctgtgc taaatgcggg gtatacggca gttccttttg aacatgtttc caaccaggca 420
actacaacag atcacacttg tcagggaaac aaaactacag cggttggcgt gagrcctgaaa 480
actggagcag atgcgaagat ttcatttcgt attaaacgtt caataaatgg aacggtagta 540
atacctatca ccgatattgc attgctgtat gccaacatat ccagcaccac gacccgtggt 600
gaggcgattg caaaagttcg aatttcaggc agtttgaccg caccacagtc ttgtcagata 660
aatgcaggac aggtgattta ttttgatttt gatactattc ctgcgtccga attittcatct 720
accgccgggc aagccattac ttcacgaaaa atcactaaaa cagtgagtat tgagtgtacg 780
gggatggggt atgagcgtac gcagaaagtc gatgcttctt ttacggggac gaaccgaagc 840
agtgacgata cgatggtggc gacagacaat gctgatgtcg ggatcaaaat ttacaataaa 900
tcgaatgctg aagttagcgt caacaacggc aagttacccg cagacatggg caacacgacc 960
atttttggtc gtaaaaatgg ttcggtaact ttttcggcag cacctgccag ctttaccggt 1020
gcccggcctc agcccggcgt ttttaacgct accgcgacct taaccattga atttgtaaac 1080
taa 1083

<212> Type : DNA
<211> Length : 1083

SequenceName : SEQ ID 391
SequenceDescription :



Sequence 



<213> OrganismName : Escherichia coli O157:H7

<400> PreSequenceString :

atgtcacgtt ataaaacagg tcataaacaa ccacgatttc gttattcagt tctggcccgc 60
tgcgtggcgt gggcaaatat ctctgttcag gttctttttc cactcgctgt caoctttacc 120
ccagtaatgg cggcacgtgc gcagcatgcg gttcagccac ggttgagcat gggaaatact 180
acggtaactg ctgataataa cgtggagaaa aatgtcgcgt cgtttgccgc aaatgccggg 240
acatttttaa gcagtcagcc agatagcgat gcgacacgta attttattac cggaatggcc 300
accgctaaag ctaaccagga aatacaggag tggctcggga aatatggtac agcgcgcgtc 360
aaactgaatg tcgataaaga tttctcgctg aaggattctt cgctggaaat gctttatccg 420
atttatgata cgccgacaaa tatgttgttc actcaggggg caatacatcg tacagacgat 480
cgtactcagt caaatattgg ttttggctgg cgtcattttt caggaaatga ctggatggcg 540
ggggtgaaca cctttatcga ccatgattta tcccgtagtc atacccgcat tggtgttggt 600
gcggaatact ggcgcgatta tctgaaactg agcgccaatg gttatattcg ggcttctggc 660
tggaaaaaat cgccggatat tgaggattat caggaacgcc cggcgaatgg ttgggatatc 720
cgcgcagagg gctatttacc tgcctggccg cagcttggcg caagcctgat gtatgaacag 780
tattatggcg atgaagtcgg gctgtttggt aaagataagc gccagaaaga cccgcatgct 840
atttctgccg aggtgaccta tacgccagtg cctcttctga cactgagcgc cgggcataag 900
cagggcaaga gcggtgagaa tgacactcgc tttggcctgg aagttaacta ccgaattggc 960
gaacctttgg cgaaacaact cgatacggat agcattcgcg agcgtcgggt actggcaggc 1020
agccgctatg acctggttga gcgtaataac aacatcgttc ttgagtaccg caaatctgaa 1080
gtgatccgta ttgctctgcc tgagcgtatt gaaggtaagg gcggtcagac actttccctg 1140
gggcttgtgg tcagcaaagc aactcacgga ctgaaaaatg tgcagtggga agcgccgtca 1200
ttactggctg aaggtggcaa aattaccggt cagggtagtc agtggcaagt aacgctcccg 1260
gcttatcgtc caggcaaaga caattattat gcgatttcag cagttgccta cgataacaaa 1320
ggcaatacct caaaacgcgt gcagacagag gtggtcatta ccggagctgg tatgagcgcc 1380
gatcgcacgg cgttaacgct tgacggtcag agccgtattc aaatgcttgc taacggtaat 1440
gagcaaaaac cgctggtgct gtctctgcgc gacgccgagg gccagccagt cacgggcatg 1500
aaagatcaga tcaagactga actaactttc aaaccggctg gaaatattgt gactcgttcc 1560
ctgaaggcca ctaaatcaca ggcaaagcca acactgggtg agttcaccga aactgaagca 1620
ggggtgtatc agtctgtctt tactaccgga acgcagtcag gtgaggcaac gattactgtt 1680
agcgttgatg gcatgagcaa aaccgtcact gcagaactgc gggccacgat gatggatgtg 1740
gcaaactcca ccctgagcgc taacgagccg tcaggtgacg tggttgctga tggtcagcaa 1800
gcctatacgt tgacgttgac tgcggtggac tccgagggta atccggtgac gggagaagcc 1860
agccgcttgc gatttgttcc gcaagacact aatggtgtaa ccgttggtgc catttcggaa 1920
ataaaaccag gcgtttacag cgccgcggtt tcttcgaccc gtgccggaaa cgttgttgtg 1980
cgtgctttca gcgagcagta tcagctgggc acattacaac aaacgctgaa gtttgttgcc 2040
gggccgcttg atgcagcaca ttcgtccatc accctgaatc ctgataaacc ggtggttggg 2100
gggaeagtta gggcaatctg gacggtaaaa gatgcctatg acaaccctgt gaccagcctc 2160
acgccggaag cgccgtcatt agcgggtgcc gctgctgaag gttctacggc atcgggctgg 2220
acaaataatg gtgatgggac gtggactgcg cagattactc tcggctctac ggcgggtgaa 2280
ttagaagtta tgccgaagct aaatggacag aatgcggcag caaatgcggc aaaagtaacc 2340
gtggtggctg atgcgttatc ttcaaaccag tcgaaagtct ctgtcgcaga agatcacgta 2400
aaagccggcg aaagcacaac cgtgacgctg gtggcgaaag atgcgcatgg caacgctatc 2460
agtggtcttg cgttgtcggc aagtttgacg gggaccgcct ctgaaggggc gaccgtttcc 2520
agttggaccg aaaaaggtaa cggttcctat gttgctacgt tgactacagg tggaaagacg 2580
ggcgagcttc gcgtcatgcc tctcttcaac ggccagccag cagccaccga agccgcgcag 2640
ttgacggtca tcgccggaga gatgtcatca gcgaactcta cgcttgttgc ggacaataag 2700
gctccgaccg tcaaaacgac gacggaactc accttcaccg tgaaggatgc gtacgggaac 2760
ccggtcaccg ggctgaagcc agatgcacca gtgtttagcg gtgccgccag cacggggagt 2820
gagcgtcctt cagcaggaaa ctggacagag aaaggtaatg gggtctacgt gtcgacctta 2880
acgctgggat ctgccgcggg tcagttgtct gtgatgccgc gagtgaacgg ccaaaatgcc 2940
gttgctcagc cactggtgct gaacgttgca ggtgacgcat ctaaggctga gattcgtgat 3000
atgacagtga aggttaataa ccaactggct aatggacagt ctgctaacca gataaccctg 3060
accgttgtgg acacctatgg taacccgttg caggggcagg aagttacgct gactttaccg 3120
cagggtgtga ccagcaagac ggggaataca gtaacaacta atgcggcagg taaagcggac 3180
attgagctta tgtcaacggt tgcgggagaa cacaatattt ccgcttcggt gaatggtgct 3240
cagaagacgg tcacggtgaa attcaacgcg gatgccagca ccggtcaggc aaacctgcag 3300
gtagacgccg ctgctcaaaa agtggcaaac ggcaaagatg cctttacgct gacggcgaac 3360
gttgaggata aaaatggtaa ccctgttcca gggagcctgg tgacctttaa tctgccccgg 3420
ggtgtcaagc cgcttacagg cgataatgtc tgggtgaaag ccaacgatga ggggaaagca 3480
gagttgcagg tggtttcagt gactgccgga acgtatgaga tcacggcatc ggcagggaat 3540
agccagcctt cgaatacgca gactataacg tttgtagccg ataaggctac cgcaaccgtc 3600
tccggtattg aggtgattgg caactatgca ctggcggacg gcaatgccaa acagacgtat 3660
aaagttacgg tgactgatgc caataacaac ctgttgaaag atagcgaagt gacgctgact 3720
gccagcccgg caaatttagt tctgactccc aatgggacgg cgaaaactaa tgagcaagga 3780
caggctattt tcaccgccac gaccactgtc gcagcgaaat atacactcac ggcgaaagtg 3840
agtcaggccg acggtcagga atcgacgaaa actgccgaat ctaaattcgt cgcggatgat 3900
acaaatgcag tactcaccgc atcatctgat gtgacttctc tggtggcgga tgggatatcg 3960
actgcgaagc tggaggtgac actgatgtcg gcaaataacc ccgttggggg gaatatgtgg 4020
gtcgacatta agacgccaga aggggtgacg gagaaggatt atcagttcct gccgtcgaaa 4080
aatgaccatt tcgtgagcgg aaaaatcacg cgtacattta gtaccagcaa gcctggtgtc 4140
tatacgttca catttaacgc cctgacgtat ggcgggtacg aaatgaagcc agtgacggtg 4200
accattaccg cggtggatgc cgatacggca aagggcgagg aggcgatgaa ctaa 4254

<212> Type : DNA 
<211> Length : 4254 

SequenceName : SEQ ID 392 
SequenceDescription : 

Sequence 



<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString : 

60 atggcgcgtg gttgggcgtc ttcagaagcc tcaggcgcga tgactgattg gttaaataac 60 

tttggtactg cgagaatctc tctgggtgtg gatgaagatt ttagcctgaa aaattcgcaa 120 

ttcgacttcc tgcatccgtg gtatgacaca cctgattatc tgctcttcag ccagcatacc 180 

cttcaccgaa cagacgatcg tacccagatc aacaccggtt tgggctggcg tcatttcacc 240 

tccagctgga tgtcaggcat caaccttttt tttgaccacg acctgagcccf ctatcactcc 3 00 

65 cgcgcagggc ttggcgcaga atactggcgt gattatctga agttgagcag caacgcttat 3 60 

atcggcctga ccggctggcg tagcgcacca gaattggata acgacttcga agcccgcccg 420 

gccaacggct gggatttacg cgcggaaggc tggttacctg cctggccaca actgggggga 480
aaactggtct atgaacaata ctatggcgat gaagtggcgc tgtttgacaa gaatgatcgt 540
caaagtaacc cccatgctat taccgcaggc ctcaactata cccccttccc gcttctgact 600
ctcagtgcgg aacagcgtca ggggaagcaa ggtgaaaatg acacacgttt tgccgttgat 660
ctgacctggc aacccagcag ttcaatgcag aaacagctta atccggacga agtggccgga 720
cggcgcagtc tggccggtag tcgttatgac ctgattgatc gcaacaacaa catcgttctg 780
gaataccgca agaaagagct gattcgcctg agtctgctgg atccggtgaa agggaagtct 840
ggagaaataa aaccgctggt ttcctcgcta cagaccaaat atgcccttaa aggctataac 900
atcgaagccg ctgcgctgga agctgccgga ggtaaagtca gcacgtctgg aaaagatatc 960
acggtcacgc tgccaggtta ccgcttcact aacaccccag aaaccgataa tacatggtcg 1020
atagacgtta ccgccgagga tgjtaaaaggt aacctgtcac ggcatgaaca aagcatggta 1080
gttattcagg ctccgacatt aagccagaaa gattctctgt tatccgtcaa tccgctaacc 1140
gtggctgcag ataaaaaatc gacgaccaca ttgaccgtta ctgcgcacga ttccgacgga 1200
actccggtgc cggggctggc gctgcaaacc cgcagtgaag gcgttcagga tatcaccctg 1260
tctgactgga cagataacgg tgatggtagt tacacacaga tactgaccgc cggaacgaca 1320
tcaggttcag taacactgac gccgcaaatt aacggtgaga gtgcggtaaa agaatccatc 1380
gtcgttaata tcgtccctgt tgtctcatcc cgcgaccatt catcaataac aattgataac 1440
gtatcgtatt atgccggaga cgacatcaag gttagggtgg aactgaaaga cgatagcaat 1500
caaccggttg catatcaaaa agaggaattg gtaaaagccg ttactgtcga aaacagcaaa 1560
cctggcgcca cgattgtctg gcacgaagag cagccggggg tttatgccgc gaattatccg 1620
gcctataagc aagggactgc actaagggca caacttagcc ttcacaactg gaatgctcca 1680
ctgcaatcgc atatttataa cattgaggca aaccagaata aggctcgcgt tgccacatta 1740
tcagcgacaa ataatgacgt ttacgccgat aaaaagacat ttaataccct cacgatcaac 1800
gtcactgatg agagtgataa tcccctgaca aatcatcagg tcacctttaa gaatgaaaaa 1860
ggaagcgcgg agtttgtcga accgccgcag caaaatacgg atgcatatgg tgttgccaca 1920
ataaacatgg taagtcaggt tgcggaagaa aatacgatta gcgccacgct gccaaatggt 1980
ttttcacaac ggataattgc gaaattcgtt agcgattcga gtacgccaaa attcaaacaa 2040
ctggttgccg atccagatac cattattgct ggcaacagcc agggcagtac tctgaccgcc 2100
atcatcacag actfctcataa caacccgtta aaagatatga aagtgaattt tgtggcacct 2160
ggtggctcgc aactggacaa cacgaccgcc acaacagacc agtccggtat tgtgcgggtg 2220
cacctgacca gttcaaaagc tggtagctat tccgtcgatg cctcgcttga ggtggataaa 2280
aatattcacc agtcggtcac gatcaccgtg gtcccaaaca gggaacaatc ggtaatgacc 2340
ttgaatgccg ggtcgggcag tgcgatcgct aacaatacaa atatcgttac cctgactgcc 2400
agtgtgaaag atgtttatgg acacccgttg ccggatgagg atgtgaaatt taccttgcca 2460
gcctccatga ccgggaactfc cacgctaagt agtgaaaccg cccgcaccga tgcaaacggt 2520
gatgccgtgg tcacattgcg aggcacaaaa gcgggtgagt ttacagttac ggcgacgctg 2580
accagaaata ataccgttgc ttatcagcaa gtcactttta ttggggatac aaacagtgcg 2640
cagctccagc cgctgactgc ctcattaaat tccattgttg cgggtaacag tacggggagt 2700
accctgacgg caacgatcct ggacgcttac Cciaaatccgc ttaaagacca gttggtcact 2760
ttccagagta acgatgtcac tctaagcgaa acagaagtca ccaccaatac gctgggtcag 2820
gcgacggtaa caatgaccag caatattgcc ggacaacata acgtcgtggt gagccggaaa 2880
gcgcaagctt ccgataataa aacgtttagt ttatcagtgc taccggatga aagttcggcg 2940
aaggtaataa gtataaccgg agccgaaaaa acgataacgg tgggcgaaaa catcacgcta 3000
cggatactcg tccaggacgc gtttaacaat gtaatcgcgg gtcaacgcgt cagattaagt 3060
gcgcagccaa caactaacat tacgataggc gatacggctt acaccgataa taacggttat 3120
gcgtacgtta accttctcag cacccaacct ggggtttatc aggtgacggc aacgctggac 3180
aataacagta gtagtaaggt tgacgtgaat gtggcaaatg gcaaactcga gttaacatca 3240
tcgaaaccag aaactacggt ccataatagt gagggtatta cgctgaccgc aacggcgaga 3300
aatgcgcggg gtgaattgat gccagggcaa attatcacct ttagcgtaac gcctgaaggt 3360
gcaacgctaa gcaatacagg ggaagtcctt actgaccagt caggtcaggc caaagtgacg 3420
ctgaccagtg acaaagtgaa tgtctatacc gttacggcca taatgggcaa agatgttccc 3480
gttcagagcc aggtaacggt tgcggttaag gcagatgcta aaacggcaca tgttgtgagc 3540
gtcgtggctt ctcctgacac catcaccgcc gacggcatcg atagcagcac catcacttca 3600
cgagtagaag atgattacgg attcccggtt gaaggtgtcg atattagtca tggcttagac 3660
accaaaggca gcccggtagt taatattcca actacgcgta ccgatcagtc cgggcaagtc 3720
acggcgacaa taaccagtac attggcagaa accttaacag tcaatgtgca agttcctggc 3780
acagccaacc aatccgcaac cattacattg gttgccggca cggccgatga aagtaagtca 3840
attttgaaat ccgatgttga cactctgaag gctgactacc agcagagcgc aaaacttacg 3900
ctaacattgc aagacaagta cggtaacccg atagtgacgt ctgatcatct ggaatttgtc 3960
cagtcaggcG ccttcgtgaa ctttctcaag ttgagcgata ttgattacag ccaaagaaat 4020
tatggcgagt acaccgtgac tgtcactggc ggaaaagagg gaacagcgac actcattccc 4080
atgctgaacg gggttcatca ggcaaactta agcatatcgc tgaatctcat ccaatcgata 4140
aaagaaatgt ccggtcatgt cactgcaaac aaccatacct tctccacggc taaattcccg 4200
agcgaaggct ttgcaggagc gtattacaca ctcaacaatg ataactttga agcgggtaaa 4260
accgttgatg attatatgtt ttcaagttca cagggttggg tgtctgtcga tgcttcgggt 4320
aaagtttctt tcgcaaatat cggcgatcaa acgtcagtca caataagcgc tgttccccga 4380
caaggaggta caacctacca gaccttaatt aagctgaaag gctggtgggt gaataatgga 4440
aatcatacca atatctggct agctgccaat gcgctctgtc atgctaaaaa tgatggatat 4500
aatcttcctg gcatcacaca tttgacgtct ggcgaaaaca aacgcacgca gggatcactg 4560
tatggtgaat gggggaacgt tggagcgttt tccagtaatt cgcaatttac accgggagct 4620
tactggacaa gtgaatctga tgattacagt cggcactact atgtgcagat gctaaccggt 4680
atgaccggaa gcgacgctga ttccagcccc caactgaccg cctgccgtaa atcactttaa 4740



<212> Type : DNA 

<211> Length : 4740 

SequenceName : SEQ ID 393
SequenceDescription :

Sequence

<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString :

atgattactc atggttgtta tacccggacc cggcacaagc ataagctaaa aaaaacattg 60

attatgctta gtgctggttt aggattgttt ttttatgtta atcagaattc atttgcaaat 120

ggtgaaaatt attttaaatt gggttcggat tcaaaactgt taactcatga tagctatcag 180

aatcgccttt tttatacgtt gaaaactggt gaaactgttg ccgatctttc taaatcgcaa 240

gatattaatt tatcgacgat ttggtcgttg aataagcatt tatacagttc tgaaagcgaa 300

atgatgaagg ccgcgcctgg tcagcagatc attttgccac tcaaaaaact tccctttgaa 360

tacagtgcac taccactttt aggttcggca cctcttgttg ctgcaggtgg tgttgctggt 420 

cacacgaata aactgactaa aatgtccccg gacgtgacca aaagcaacat gaccgatgac 480 

aaggcattaa attatgcggc acaacaggcg gcgagtctcg gtagccagct tcagtcgcga 540 

tctctgaacg gcgattacgc gaaagatacc gctcttggta tcgctggtaa ccaggcttcg 600 

tcacagttgc aggcctggtt acaacattat ggaacggcag aggttaatct gcagagtggt 660

aataactttg acggtagttc actggacttc ttattaccgt tctatgattc cgaaaaaatg 720 

ctggcatttg gtcaggtcgg agcgcgttac attgactccc gctttacggc aaatttaggt 780 

gcgggtcagc gttttttcct tcctgcaaac atgttgggct ataacgtctt cattgatcag 840 

gatttttctg gtgataatac ccgtttaggt attggtggcg aatactggcg agactatttc 900 

aaaagtagcg ttaacggcta tttccgcatg agcggctggc atgagtcata caataagaaa 960

gactatgatg agcgcccagc aaatggcttc gatatccgtt ttaatggcta tctaccgtca 1020 

tatccggcat taggcgccaa gctgatatat gagcagtatt atggtgataa tgttgctttg 1080 

tttaattctg ataagctgca gtcgaatcct ggtgcggcga ccgttggtgt aaactatact 1140 

ccgattcctc tggtgacgat ggggatcgat taccgtcatg gtacgggtaa tgaaaatgat 1200

ctcctttact caatgcagtt ccgttatcag tttgataaat cgtggtctca gcaaattgaa 1260

ccacagtatg ttaacgagtt aagaacatta tcaggcagcc gttacgatct ggttcagcgt 1320

aataacaata ttattctgga gtacaagaag caggatattc tttctctgaa tattccgcat 1380

gatattaatg gtactgaaca cagtacgcag aagattcagt tgatcgttaa gagcaaatac 1440 

ggtctggatc gtatcgtctg ggatgatagt gcattacgca gtcagggcgg tcagattcag 1500 

catagcggaa gccaaagcgc acaagactac caggctattt tgcctgctta tgtgcaaggt 1560

ggcagcaata tttataaagt gacggctcgc gcctatgacc gtaatggcaa tagcfcctaac 1620 

aatgtacagc ttactattac cgttctgtcg aatggtcaag ttgtcgacca ggttggggta 1680 

acggacttta cggcggataa gacttcggct aaagcggata acgccgatac cattacttat 1740 

accgcgacgg tgaaaaagaa tggggtagct caggctaatg tccctgtttc atttaatatt 1800

gtttcaggaa ctgcaactct tggggcaaat agtgccaaaa cggatgctaa cggtaaggca 1860

accgtaacgt tgaagtcgag tacgccagga caggtcgtcg tgtctgctaa aaccgcggag 1920 

atgacttcag cacttaatgc cagtgcggtt atattttttg atcaaaccaa ggccagcatt 1980 

actgagatta aggctgataa gacaactgca gtagcaaatg gtaaggatgc tattaaatat 2040 

actgtaaaag ttatgaaaaa cggtcagcca gttaataatc aatccgttac attctcaaca 2100

aactttggga tgttcaacgg taagtctcaa acgcaagcaa ccacgggaaa tgatggtcgt 2160

gcgacgataa cactaacttc cagttccgcc ggtaaagcga ctgttagtgc gacagtcagt 2220

gatggggctg aggttaaagc gactgaggtc actttttttg atgaactgaa aattgacaac 2280

aaggttgata ttattggtaa caatgtcaga ggcgagttgc ctaatatttg gctgcaatat 2340

ggtcagttta aactgaaagc aagcggtggt gatggtacat attcatggta ttcagaaaat 2400

accagtatcg cgactgtcga tgcatcaggg aaagtcactt tgaatggtaa aggcagtgtc 2460

gtaattaaag ccacatctgg tgataagcaa acagtaagtt acactataaa agcaccgtcg 2520 

tatatgataa aagtggataa gcaagcctat tatgctgatg ctatgtccat ttgcaaaaat 2580 

ttattaccat ccacacagac ggtattgtca gatatttatg actcatgggg ggctgcaaat 2640

aaatatagcc attatagttc tatgaactca ataactgctt ggattaaaca gacatctagt 2700

gagcagcgtt ctggagtatc aagcacttat aacctaataa cacaaaaccc tcttcctggg 2760

gttaatgtta atactccaaa tgtctatgcg gtttgtgtag aataa 2805 
<212> Type : DNA 
<211> Length : 2805 

SequenceName : SEQ ID 394 

SequenceDescription :

Sequence

<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString : 

atgttagtac ttagcgaaag cttcaagaat aaattgcttc ccatgaatgg gtatatgaaa 60 

ggcggcagcg actccggatc taaagcccag gcacgcgcaa ctgaaaaggg catcgaactg 120

cagcgtgaaa tgtggcagac gaacatgcaa aaccttgcac cgttcacgcc actcgctcag 180

cagtacgtat cacagttgca gaatctttcc tctcttcagg ggcaaggtca ggcgcttaac 240

cagtattaca actctcagca gtataaagac cttgcagggc aggcgcgcta tcagagtctg 300

gcagcagcag aggcaacggg tggattaggc tctacagcaa caggaaacca gttagcagca 360

atcgcaccta cactcggtca aaactggctg tcaggtcaga tgaacaacta caacaatctg 420

gcaaatatcg gccttggtgc tcttacaggt caggcaaacg ccggacagaa ctacgctaac 480

aatgtcagcc aattgtatca acagcaggcg gcagcatcgg cagcaaatgc gaataagcct 540

tcaggcctac agagttttgc tacaggtgcc attggtgggg ccgcatcagg tgcaatgatt 600

ggtagtgcag ttcctgttp^t tgggactggt attggtgctc ttgctggcgg tgttatcggt 660

ggtcttggat cattgtttta a 681
<212> Type : DNA 
<211> Length : 681 

SequenceName : SEQ ID 395
SequenceDescription :

Sequence

<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString :

atgaaaaaaa tattatcagg gttgattctg ctgctttgct gtccttatgg tttcgccgct 60
aatggtgatg gcgcaacgca catgtcaaat ttatcatttg gtccgctgac ggtggcagcg 120
gcgaataacc actccggata caatattttc gaggcactga gcaacacgac tggaacatac 180
ccggtgcgct gtcactgtga tgacacgcat ggcggaccgg gccaacaaac agcatttttt 240
cctatcttct acacgggaga tgccgcaccg gggcttgtgc ttgagcgcac tcttaatggg 300
ttaaattact atgctctgaa tgattattta tcggtcggcg tgacgatttt tattattaat 360
aaccaatatg cggccattcc ttttgaacac ttatccaacc aatccacctc accgcaacat 420
acctgtggag caggtaataa tgggagcact gtaaatctgg attcagggcg ctcggcaaaa 480
ttatctttct atgttcggca ttctattact ggcacggtga caatacccac aacggaagtc 540
gcctggttgt acgcgggcat gtccgatcat tttcccaaaa cgacccccgt ttctaaagtg 600
acaattcgcg gacaactaac ggccccgcag aactgtgagt taacgccaaa tcagagcatc 660
gatgtcgatt ttcaaaaaat taatagcgct gagttctcct caacggcggg ttcaattatt 720
gcagaaagaa agattaaaac cgaagtcaca gtatcctgta ccgggatgga agacgtaagg 780
tccacggagg tggtgagtgc gtcgatgatt gcggcaaaca gaagtgccga tgccaccatg 840
atcgtgacga gtaatccgga tgtgggaatt aagatttttg ataagaacga ccgtccagtg 900
aatgtggatg ggggcaactt acccgctgat atgggtgcta ttagtcgatt aggaaaaacc 960
gatggtagcg taacgtttta ttcagcgccc gccagtctga cgggcgcaaa accagcgcct 1020
gataatggat ttaccgctac agccacgctg gttattgaat ttactaacta a 1071
<212> Type : DNA
<211> Length : 1071

SequenceName : SEQ ID 396
SequenceDescription :

Sequence

<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString :

atgaataaaa tatatcggct aaagtggaac aggtcccgta actgttggag cgtctgctcg 60
gagctgggga gcagagtaaa aggaaaaaag tcccgggctg ttttaattag cgcgataagt 120
ttatattcat ctctggtatt cgccgatgat gtcatcgtaa accaggataa aactattgat 180
tttggcaaag agaaccagag catcgattac cgtattacgg tgacagacaa tgccaatctg 240
gtaatcaatg cgacagatac ttcccgtccg cgtctgactc tcgcttctgg tggtgggttg 300
gatattaccg gaggaaaggt aacaatcaat ggcccactta actttttgct gaaaggtacg 360
gggttcctga atgtctccaa tgctggcagc gagttatatg ctgatgattt gtatgaatca 420
aactcaggca tgagacacga tcgcggctat tttaatgtct ccaacggcgg caaaatccat 480
gttaagggca ccagccgtct gacctatttg cagggaaatg tcagtggtga aggtagccag 540
gtaaattccg aaaccttctt tatgggcgtt tacggcagtt acggtggtaa tcagtacctg 600
tcagttaata acggcggtga agttaatgcc aggaagcaaa ttagcctggg ctattatgat 660
caagtctccg atacaacact tgctgtttcg gaaggtggta aaatttctgc gcctactatt 720
agtttaagca ccaactctga gttagcgtta ggggcacagg aaggaagcgc agcgaaggca 780
gcagggatta ttgatgccga aaaaattgag tttgtgtggg caaagacatc cgagaagaaa 840
atcaccttaa accacacgga taaagacgcg actatttccg cggatattgt cagtggcagc 900
gagggcctgg gctatatcaa tgcgctcaat ggcacgactt acttaaccgg tgataactct 960

gcctttagtg gtaaagtcaa aattgagcaa aatggcgctt tagggatcac ccaaaatata 1020 

ggtacagcag agatcaacaa ccgcgggaaa ttacaccfcga aggctgacga fcagcatgacc 1080 

tttgccaata agatctctgg caacggtaca ataagtatcg acagtgggac ggtggagttg 1140 

accggcaata actatgcatt cagcggatat attgatgttg cttctggtgc tgtcgctgtt 1200

atttctgaag acaagaatat cggtcgtgca gagctggatg tcgatggcaa attgcaaatt 1260

aatgccaaca aagattgggt atttgataac gatcttgaag gtagaggcat tgttgaaata 1320

aacatgggga atcacgaatt ctccttcgat gagtttgctt atacagactg gttccagggt 1380

tcactggcgt tccagaacac gacatttaat ctggaaaaga atgctgagtt tctgcagaaa 1440

ggcgggatca ctgcgggtca gggaagcctg gtaacagtgg gtaagggcgc tcactccatt 1500

agcactttgg gattctccgg cggaaccgtt gattttggtg ccctgacagc aggtgcacag 1560 

atgacagaag ggacggtcaa cgttagtaaa acgctggatt tgcgcggcga gggtgtgatt 1620 

caggtttctg acagtgacgt tgtccgctca gtatctcgtg atattgactc tgcgttatcg 1680

ctcactgaag tcgatgatgg taacagcacc attaagttgg ttgatgcgca aggtgcggaa 1740

gttctgggcg atgcgggcaa tctgcaattg caggataaaa atgggcaaat cctctccagc 1800

agcgcccaac gtgatattca gcagaatggg caaaaagcgg ccgtcggcac ttacgactat 1860 

cgtctgacga gtggggtaaa caatgacggt ctgtatattg gttacggcct gacccagctt 1920 

gatttacacg ctaccgacag cgatgctctg gtgctgagct ctaacggtaa aagcgagaat 1980 

gccgccgatc tcagcgcaaa gattaccggc agtggtgacc tggcattcag cagccagaag 2040 

ggtcagaccg tatcgctttc taacaaagac aacgactata ccggtgttac cgatctgcgc 2100

agtgggacgc ttttgttgaa taacgataac gtgttgggta atacccatga actgcgtctg 2160 

gcggcagaga ctgaactgga catgaatggt cacagccaga ccgtgggcac gctcaatggc 2220 

agcgccgatt cactgctgag cttaaatggc ggcagtctga cggttaccaa cgggggcact 2280

tcaaccggtt cgttaacggg gagcggagag ctgaatattc agggcggcac gctggacatc 2340

gcgggcgata acagcaacct gacggcgaat gtgaacattg ctaattcggc taatgtcctg 2400

gtaagtcatg cgcagggatt gggtagcgca aacgttgaga acaacggtac cctggcgttg 2460

aataatagcg ctgaaaaaag agcggctgcg tctgtgaatt acgccctggg cggcaatctg 2520

accaacaacg gtacgctgat gaccggaatg tcaggacagc aagctggcaa tgtgttagtg 2580 

gtgaagggga actaccacgg taataacggt caactagtaa tgaatacggt actgaatggc 2640 

gatgactcag taaccgataa attggttgtc gagggcgata ctagcggcac gactgccgtt 2700

acggtgaata acgctggcgg tacaggtgcg aaaaccctta acggtatcga acttatccat 2760 

gtagacggta agtctgaggg cgaatttgtt caggctgggc gtatcgttgc gggggcgtat 2820 

gactacactc tcgcgcgtgg acaaggggca aatagtggta actggtatct gaccagcggc 2880 

agtgattctc ctgaactgca gccggagcca gacccgatgc cgaatccaga gccaaacccg 2940 

aafcccagagc cgaaccctaa cccgacacct acgccgggtc cggatctgaa tgtggataat 3000

gacctgcgac cggaggcggg tagctacatt gcgaaccttg cagcagcgaa taccatgttc 3060 

accacgcgtc tgcatgagcg tctgggtaat acgtactata ccgacatggt gacgggtgag 3120 

cagaaacaaa ccactatgtg gatgcgccat gaaggtggtc ataataaatg gcgtgatggc 3180 

agcggccagc tgaaaaccca aagcaatcgc tatgttctgc aactgggagg cgatgtcgcg 3240 

cagtggagcc aaaacggcag cgaccgctgg catgttgggg tcatggcggg atatggcaac 3300

agcgacagca aaaccatttc ctcgcgaacc ggttatcgtg caaaagcgag tgtgaacgga 3360

tatagcacag gcctctatgc cacctggtat gccgatgacg agtcgcgtaa tggcgcgtat 3420 

ctcgacagtt gggcgcagta cagctggttt gataacacag tgaaagggga tgacttacaa 3480 

agtgaatcct ataaatcaaa aggatttacc gcttcactgg aagctggata caaacacaaa 3540 

ttagctgaat ttaatggcag ccagggaacg cgtaatgaat ggtatgttca gccgcaagca 3600

caggttacct ggatgggagt caaagccgat aagcaccgcg aaagcaacgg aaccctcgtt 3660 

catagcaacg gtgatggcaa tgttcaaacc cgacttggcg taaaaacctg gctgaagagc 3720 

caccataaaa tggatgacgg ta^atcccgc gagttccagc cgtttgtaga agtgaactgg 3780 

ctacataaca gtaaggattt cagcaccagt atggatggcg tgtctgtcac tcaggatgga 3840

gcccgaaata ttgctgagat aaaaaccggg gtggaaggac agctaaatgc caacctgaat 3900

gtctggggga atgtgggcgt tcaggttgcc gataggggat ataatgacac ctctgcaatg 3960

gttggcatta agtggcaatt ctga 3984 
<212> Type : DNA 
<211> Length : 3984 

SequenceName : SEQ ID 397

SequenceDescription :

Sequence

<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString : 

atgataacga tgaaaaaaag tgtattgacg gcgtttataa ctgtggtatg tgcaacgtcc 60 

agcgttatgg ctgctgatga taatgctatc acggatggct cagtaacatt taatggtaaa 120 

gttattgctc cagcttgtac cctggtagct gcgacgaaag attccgtggt gactttgcca 180 

gatgttagtg ccacgaagtt gcaaaccaat ggtcaggttt ctggcgtgca aactgatgtg 240

ccaattgaafc taaaagattg tgatactacc gtaacaaaaa atgcaacgtt cacctttaat 300 

ggcactgcgg atactactca gattacagcg tttgctaacc aggcctcatc tgatgctgct 360
acaaacgtgg ccctgcaaat gtatatgaat gatggtacaa cggccatcaa gccagacaca 420
gaaaccggga acattttgtt gcaagatgga gafccagacgt tgacttttaa agttgattat 480
atcgctacgg ggaaagcgac ttctggtaat gtgaatgcgg taacaaattt ccatattaac 540
tattattaa 549
<212> Type : DNA

<211> Length : 549 

SequenceName : SEQ ID 398 

SequenceDescription :

Sequence

<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString : 

atgagtaagt ttgtaaaaac agctattgct gcaacaatgg taatgggtgc gtfccgcttct 60 

acttcaacaa tcgccgctgg caacaatggt acagcacgtt tctacggcac cattgaagat 120

tccccgtgct ctatcgttcc tgatgatcac aaactggaag ttgatatggg tgacattggt 180 

tcagggatcc tgaaaaataa cgggacttct acaccgaaag ctttccagafc ccatctgcaa 240 

gactgtgtgt ttgacaccca gacaacgatg accactacct tcaccggtaa cgcgtcttct 300 

accaacagcg gcaattacta caccatttac aataccgata ctggtgcggc atttaacaat 360

gtcagcctgg ccattggtga cgctcaggga acctcttata aaagcggcgc gggtatcgaa 420

cagaaaatcg taaacgatac ggcgaccaac aaaggcaaag cgaagcagac gctggacttt 480

aaagcctggc tggtgggcgc tgctgatgcg ccagatttag gtaattttga agccaacacg 540 

accttccaga ttacttatct ctaa 564 
<212> Type : DNA 
<211> Length : 564

SequenceName : SEQ ID 399
SequenceDescription :

Sequence

<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString :

atgcgggtta tctttctacg caaggagtat ttatctctac tcccgtcaat gattgcatct 60 

cttttctctg ctaacggtgt cgcggcggcc attgatttat gccagggata tgatatcaaa 120 

gcgagttgtc acgccagcag gcaaagcctt tcaggcatta cgcaggtctg gagtattgcc 180

gatgggcaat ggctggtttt ttcggatatg accaataatg ccagcggtgg ggccgtattt 240 

ttgcaacaag gagcggaatt tacattatca ccagaaaatg aaactggaat gactctgttt 300

gccaataaca ccgtttcagg agaatataat aacggcgggg caatatttgc taaagaaaac 360

tcaacgctga atcttacgga tgttattttt tctggtaacg tcgcaggcgg ctatggtggc 420

gcaatctatt cttctggtac taacgatacc ggtgccatcg atttacgtgt cactaacgcc 480

gtgtttcgca ataacatcgc taatgacggc aaaggtggtg caatttatac catcaataat 540 

gatatctatt taagtgatga tgtttttaac aataaccagg catatacatc aacaagttac 600 

agtgatggcg atggcggcgc aatcgatgtc acagataata atagcgacag caagcatcct 660 

tcaggttata cgataataaa taacactgcc tttacaaata acactgccga aggttatggc 720 

ggggcgatat ataccaatag cgcgacggct ccctatctta ttgatatttc tgttgatgac 780

agctacagcc agaacggagg cgtgttagtc gatgagaaca atagcgcagc aggctatgga 840 

gatggtcctt cctctgcggc gggtggcttt atgtatctcg gcttaagtga agttaccttt 900 

gatattgccg acggaaaaac gctggttatt ggcaatacag agaatgacgg agctgttgac 960 

tctattgctg gtaccgggtt aatcaccaaa acaggttccg gcgatctggt acttaatgca 1020

gataacaatg actttactgg tgagatgcag attgaaaacg gtgaagttac cctgggccgc 1080

agcaactccc tgatgaatgt cggcgatacg cattgccagg acgatccgca agactgctac 1140 

ggtctgacga tagggagtat tgataagtac cagaatcagg cagagctaaa tgttggctcc 1200 

acccaacaaa cctttgcgca ctcattgacg ggctttcaga atggcacttt aaatatcgat 1260 

gctggtggca atgttactgt taatcaaggc agttttgctg gcaccatcga aggtgctggt 1320 

cagctcacca ttgcgcaaaa cggcagctat gtgctggcgg gggcgcagtc gatggcgcta 1380

accggcgata tagtggtgga tgctggtgcg gtgctttcgc tggaaggcga cgcggcagat 1440

cttgccgctc tccaggacga tccgcagtcg atcgtgttaa acggcggtat gctcgatctc 1500 

tctgatttct ccacctggca gagcggtaca tcatacaaag atggccttga agtcagtggc 1560 

agcagcggaa cggttatcgg cagtcaggat gtggtagatc ttgcaggcgg aaacgatatg 1620 

catatcggcg gcgacgggaa agatggcgtc tacgtggtga tcgatgcggg tgacgggcag 1680

gtcagcctgg caaatgacaa tcaatacctc ggcacaacgc aaatcgcttc cggtacgctg 1740 

atggtgagcg acaactcgca gcttggatat acccattata accgccaggt tatctttacc 1800 

gataagccac aagaaagcgt gatggagatt actgccaatg tcgatactcg ctctacaacg 1860 

actgagcatg ggcgtgatat tgaaatgcgc gccgacggtg aagtggcagt tgatgcgggg 1920

gtagacacgc agtggggcgc actgatggct gacagcagcg ggcagcatca ggatgagggt 1980

agcacattga ctaaaacggg ggcgggtaca ctggagctga ccgccagcgg tacaacgcag 2040 

tcagcggtga gagtagaaga gggcacgctg caaggtgatg ttgcggatat cttcccttat 2100
gcttcgtcgc tatgggtcgg tgacggggca acgttcgtta ctggcgcgga tcaggatatt 2160
cagtcaattg atgctacttc cagcggcact atcgacatca gcgatggtac ggttttgcgc 2220
ctgaccgggc aggatacttc cgtcgccctt aatgcctcac tgtttaactg cgatgggacg 2280
ctggtgaatg ccaccgatgg tgtgacgttg acaggtgagc ttaataccaa ccttgaaact 2340
gacagcctga cttatctttc caacgtgacg gttaatggca atctgaccaa tacgtccggt 2400
gcggttagcc tgcaaaatgg cgtcgctggc gatacgctga cggtaaacgg tgattatacc 2460
ggcggcggta cgctactgct cgatagcgaa ttaaacggcg atgactcggt aagcgatcaa 2520
ttggtgatga acggtaatac tgctggcaac acaactgtgg tggttaactc cattacaggg 2580
attggtgagc cgacatcgac aggcattaaa gtggttgatt tcgcagctga tcccacgcag 2640
tttcaaaaca atgcgcagtt cagtctggca ggcagcggct acgtcaatat gggagcgtat 2700
gactacacgc tggtggaaga taacaacgac tggtatctgc gatcgcaaga agtaacgccg 2760
ccatcgccac ctgatccaga cccgactccc gatcctgatc ccacgcagga tcctgatcca 2820
acacccgacc cggaacctac gcctgcttac cagccggtgt tgaatgccaa agttggcggt 2880
tatctcaata acctgcgggc ggcaaatcag gcgtttatga tggagcgacg cgatcacgca 2940
ggtggcgatg gtcagacgct gaatttacgt gttatcggcg gagattatca ttacacagca 3000
gcggggcaac tggctcaaca tgaagacact tctacggtgc aacttagcgg cgatctgttt 3060
agcgggcgct ggggcacgga tggcgagtgg atgcttggga ttgttggtgg ctacagcgat 3120
aaccagggcg acagccgctc gagtatgacc ggaactcgcg ccgataacca gaaccacggt 3180
tatgcggttg ggctgacctc aagctggttt cagcacggta agcagaagca gggggcctgg 3240
ctggataact ggttgcagta cgcgtggttt agcaatgatg tttctgaaca tgaagatggc 3300
gtggatcatt accattcgtc ggggattatc gcctcgctgg aagcggggta tcagtggtta 3360
ccggggcgtg gtgtggtgat tgaaccgcag gcgcaggtga tttatcaggg cgfcgcagcag 3420
gatgatttta ccgccgctaa ccgtgcgcgc gtgtcacaat cgcagggtga tgatattcag 3480
acgcggctgg gtttacacag cgaatggcgt accgctgttc atgtcatacc aacattagat 3540
ctgaattatt atcacgatcc ccattcgacg gaaattgaag aagatgccag cactatcagt 3600
gacgatgcgg tgaagcaacg gggtgaaata aaagtgggag tcacgggcaa tatcagtcag 3660
cgagtttcgc tgcgtggtag cgtggcgtgg cagaaaggga gtgatgattt tgcccagacg 3720
gcagggtttt tgtcgatgac ggtgaaatgg taa 3753
<212> Type : DNA
<211> Length : 3753

SequenceName : SEQ ID 400
SequenceDescription :

Sequence

<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString :

atgcactcct ggaaaaagaa acttatagta tcacaattag cattggcttg cactctggct 60
atcacctctc aggctaatgc agcgaccaac gatatttctg gtcaaactta caatactttc 120
catcactaca acgacgccac ctatgcfcgat gacgtttact atgatggtta tgtaggctgg 180
aacaactatg ccgctgatag ctattacaac ggcgatatct acccggtcat taataacgct 240
accgttaacg gcgtgatttc tacctactat ctggacgacg gtatttctac caataccaac 300
gccaatagtc tgacaatcaa aaacagcact attcacggta tgattacctc tgagtgcatg 360
actactgatt gtgctgatga ccgtgctact ggttatgttt atgatcgtct gacactgagc 420
gttgataatt caacgatcga tgacaactac gagcattata cttacaacgg tacctataat 480
aatgccgctg acactcatgt tgtagatgtt tacgatatgg gtactgctat tacactggat 540
caggaagttg atctgtccat cactaataac tctcatgtag caggtattac gctgactcag 600
ggttatgagt gggaagatat tgacgacaac acagtcagca ctggcgtaaa cagcagcgaa 660
gtgtttaata acactattac tgttaaagat tctactgtga cctctggttc atggactgat 720
gaaggtacta ctggttggtt tggccatact ggtaatgcca gcaactatag caacacgctg 780

actgcagacg atgttgcaat tgccgcaatc gcaaatccgt atgctgataa tgcgatgcag 840

actacagtaa ctttagacaa ctcaacactg atgggtgatg ttgttttctc cagtaatttc 900 

gatgaaaact tcttcccgca aggtgctaac agctatcgcg atgctgatgg tgatgtagat 960 

accaacggtt gggatggcac agaccgtatg gatgtgactc tgaacaacgg cagcaagtgg 1020

gttggcgctg caatgtctgt tcatatggtt gatgaagatg gtgatggttc ttacgacgga 1080

tatgctgttg gtactgaagc aactgcaact ctgctcgata ttgcagctaa cagcctgtgg 1140

ccttcatcaa ctgtcggtgt tgataacatc aatactcaat atgacgaaaa tggccatatc 1200

gtaggaaacg aagtttacca gagcggtttg tttaatgtga ctttgaacgg tggttcagag 1260

tgggatacaa caaaatcttc tctgattgat actttaagta ttaacagcgg ttcccaagtt 1320

aatgttgcag actctcgtct gatctctgac actgtctctc tgactggcgg ttctaacctg 1380

aacatcggtg aagacggtca tgtagcgact aataccctga ccatcgacaa tagtaccgtt 1440 

aaaatgtctg atgatgtttc tgcgggctgg ggtttagaag atgctgcact gtacgcaaat 1500 

accatcaccg taactaacga cggtctgttg gatattaacg ttgatcagtt cgatgctaac 1560 

ccgttccagg ccgataccct gaatctgacc agtaccactg atactaacgg caacattcac 1620

gctggtgtat tcgatatcca tagcagtgat tacgtaatgg ataccgatct ggtcaacgat 1680

cgtaccaacg atactaccaa gtcaaactac ggttatggct taatcgcaat gaactctgat 1740 

ggtcacctga ctattaacgg taacggcgat aacgacaaca ctgcttctat cgaagctggt 1800
cagaacgaag ttgataacaa cggtgaccat gttgcagccg cgaccggtaa ctacaaagtt 1860
cgtatcgaca acgctactgg tgctggttct atcgctgact acaacggcaa cgagctgatc 1920
tacgtcaacg acaaaaacag caacgcgacc ttctctgctg ctaacaaagc tgacctgggt 1980
gcatacacct atcaggctga acagcgcggt aacaccgttg ttctgcaaca gatggagttg 2040
accgactacg ctaacatggc gctgagcatc ccatctgcga acaccaatat ctggaacctg 2100
gaacaagaca ccgttggtac tcgtttgacc aactctcgtc atggcctggc tgataacggc 2160
ggcgcatggg taagctactt cggtggtaac ttcaacggcg acaacggcac catcaactat 2220
gatcaggatg ttaacggcat catggtcggt gttgatacca aaattgacgg taacaacgct 2280
aagtggatcg tcggtgcggc tgcaggcttc gctaaaggtg acatgaatga ccgttctggt 2340
caggtggatc aagacagcca gactgcctac atctactctt ctgctcactt cgcgaacaac 2400
gtctttgttg atggtagctt gagctactct cacttcaaca acgacctgtc tgcaaccatg 2460
agcaacggta cttacgttga cggtagcacc aactccgacg cttggggctt cggtttgaaa 2520
gccggttacg acttcaaact gggtgatgct ggttacgtga ctccttacgg cagcatttct 2580
ggtctgttGC agtctggtga tgactaccag ctgagcaacg acatgaaagt tgacggtcag 2640
tcttacgaca gcatgcgtta tgaactgggt gtagatgcag gttatacctt cacctacagc 2700
gaagatcagg ctctgactcc gtacttcaaa ctggcttacg tctacgacga ctctaacaac 2760
gataacgatg tgaacggtga tfcccatcgat aacggtactg aagggtctgc ggtacgtgtt 2820
ggtctgggta ctcagttcag cttcaccaag aacttcagcg cctataccga tgctaactac 2880
ctcggtggtg gtgacgtaga tcaagactgg tccgcgaacg tgggtgttaa atatacctgg 2940
taa 2943



<212> Type : DNA 

<211> Length : 2943 

SequenceName : SEQ ID 401 
SequenceDescription : 

Sequence

<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString : 

atgaaactca aacatgttgg tatgattgtc gtttctgtgt tggcgatgtc gtctgctgcg 60 
gtaagcgcag ccgagggtga tgaatcagta acgaccactg ttaatggcgg tgttattcat 120 
tttaaaggtg aagtggtaaa tgccgcttgt gcgattgatt ccgaatcaat gaaccaaacg 180

gttgagctgg gtcaggttcg ttcttctcgc ctggctaaag cgggtgacct cagctccgcc 240 
gttggcttca atatcaagct gaatgattgt gataccaatg tttccagtaa tgcagctgtt 300

gcattcctgg gtactactgt caccagtaat gacgatacgt tagcgctgca gagttcagcg 360 
gcaggctctg cccaaaatgt cggtattcaa attttggacc gtacgggtga ggtattaata 420 
cttgatgggg ccacttttag tgctaaaacc gacttgattg atggcacgaa tatactacca 480 
ttccaggctc gttatattgc tctcgggcag tccgtagctg gtactgcaaa cgcagatgcg 540
accttcaaag ttcaatatct ataa 564
<212> Type : DNA
<211> Length : 564

SequenceName : SEQ ID 402 

SequenceDescription : 

Sequence 



<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString : 

atgaaacttt taaaagtagc agcaattgca gcaatcgtat tctccggtag cgctctggca 60 
ggtgttgttc ctcagtacgg cggcggtggc ggtaaccacg gtggtggcgg taataacagc 120 
ggcccgaatt cagagctgaa tatttatcag tacggtggtg gtaactctgc acttgctctg 180

caagctgatg ctcgtaactc tgatcttact attacccagc atggtggtgg taacggtgca 240 
gatgttggtc agggctcaga tgacagctca atcgatctga cccaacgtgg ctttggtaac 300

agcgccactc ttgatcagtg gaacggtaaa gactctcata tgacagttaa acaattcggt 360 
ggcggcaacg gtgcagcggt tgaccagact gcatctaatt ccaccgtcaa cgtaactcag 420 
gttggctttg gtaacaacgc gaccgctcat cagtactaa 459 
<212> Type : DNA 
<211> Length : 459 

SequenceName : SEQ ID 403 

SequenceDescription : 

Sequence 



<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString : 

atgcctattg gtaatcttgg tcataatccc aatgtgaata attcaattcc tcctgcacct 60 
ccattacctt cacaaaccga cggtgcaggg gggcgtggtc agctcattaa ctctacgggg 120
ccgttgggat ctcgtgcgct atttacgcct gtaaggaatt ctatggctga ttctggcgac 180

aatcgtgcca gtgatgttcc tggacttcct gtaaatccga tgcgcctggc ggcgtctgag 240 

ataacactga atgatggatt tgaagttctt catgatcatg gtccgctcga tactcttaac 300

aggcagattg gctcttcggt atttcgagtt gaaactcagg aagatggtaa acatattgct 360

gtcggtcaga ggaatggtgt tgagacctct gttgttttaa gtgatcaaga gtacgctcgc 420

ttgcagtcca ttgatcctga aggtaaagac aaatttgtat ttactggagg ccgtggtggt 480 

gctgggcatg ctatggtcac cgttgcttca gatatcacgg aagcccgcca aaggatactg 540 

gagctgttag agcccaaagg gaccggggag tccaaaggtg ctggggagtc aaaaggcgtt 600 

ggggagttga gggagtcaaa tagcggtgcg gaaaacacca cagaaactca gacctcaacc 660 

tcaacttcca gccttcgttc agatcctaaa ctttggttgg cgttggggac tgttgctaca 720

ggtctgatag ggttggcggc gacgggtatt gtacaggcgc ttgcattgac gccggagccg 780 

gatagcccaa ccacgaccga ccctgatgca gctgcaagtg caactgaaac tgcgacaaga 840 

gatcagttaa cgaaagaagc gttccagaac ccagataatc aaaaagttaa tatcgatgag 900 

ctcggaaatg cgattccgtc aggggtattg aaagatgatg ttgttgcgaa tatagaagag 960

caggctaaag cagcaggcga agaggccaaa cagcaagcca ttgaaaataa tgctcaggcg 1020

caaaaaaaat atgatgaaca acaagctaaa cgccaggagg agctgaaagt ttcatcgggg 1080 

gctggctacg gtcttagtgg cgcattgatt cttggtgggg gaattggtgt tgccgtcacc 1140 

gctgcgcttc atcgaaaaaa tcagccggta gaacaaacaa caacaacaac tactacaact 1200 

acaactacaa gcgcacgtac ggtagagaat aagcctgcaa ataatacacc tgcacagggc 1260 

aatgtagata cccctgggtc agaagatacc atggagagca gacgtagctc gatggctagc 1320

acctcgtcga ctttctttga cacttccagc atagggaccg tgcagaatcc gtatgctgat 1380 

gttaaaacat cgctgcatga ttcgcaggtg ccgacttcta attctaatac gtctgttcag 1440 

aatatgggga atacagattc tgttgtatat agcaccattc aacatcctcc ccgggatact 1500 

actgataacg gcgcacggtt attaggaaafc ccaagtgcgg ggattcaaag cacttatgcg 1560 

cgtctggcgc taagtggtgg attacgccat gacatgggag gattaacggg ggggagtaat 1620

agcgctgtga atacttcgaa taacccacca gcgccgggat cccatcgttt cgtctaa 1677 

<212> Type : DNA 
<211> Length : 1677
SequenceName : SEQ ID 404

SequenceDescription :

Sequence

<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString :

atgttttcta ctttcaaaaa agcagctctg ctggcagcfca ttgcattacc tttttcaact 60 

atggctgcgc ctacagtcac ttttcagggt gaagtaaccg atcagacctg ttccgtaaat 120 

atcaacggtc aaaccaattc agtagtattg atgccgaccg tagccatggc tgacttcggt 180 

gcaactttag ctgatggtca gagcgcaggc cagacgccgt ttacggtttc tgtgtctaac 240

tgccaggctc caactggtgc agatcaggca atcaacacca ccttcctggg ctacgacgtt 300 

gacgctagca cgggtgttat gggaaaccgt gataccagca gcgatgcggc gaaaggcttt 360 

ggcattcagt taatggattc cagcacttct ggtaacccag taactctggc tggcgcgact 420 

aacgtaccgg gtctgaccct gaaagttggc gataccgaag ccagctacga cttcggtgcg 480 

cgttacttcg ttatcgatag cgctgctgcc actgccggta aaattaccgc tgtcgcagaa 540

tacaccctga gctacctcta a 561 
<212> Type : DNA 
<211> Length : 561 

SequenceName : SEQ ID 405 

SequenceDescription :

Sequence

<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString :

atgaacagtg aaggaggaaa accggggaat gtactgaccg ttaacggcaa ctataccgga 60 

aacaatggcc tgatgacgtt caacgcgacg ctgggcggcg ataattcacc caccgataag 120 

atgaacgtga aaggcgatac ccaagggaac actcgcgttc gggttgataa cattggcggc 180 

gtcggtgcgc aaacggtcaa cggtattgaa ctcattgagg ttggcggtaa ttctgcaggt 240 

aatttcgcgc tgaccaccgg aactgtcgaa gctggggctt acgtctacac gctggctaaa 300

gggaagggga atgacgagaa aaactggtat ctgaccagta aatgggacgg cgtaacgcca 360 

gcggatacac ccgatcccat caataatccc cctgttgtgg atccggaagg cccatcagtt 420 

tatcgcccgg aggccggaag ctatatcagc aacattgccg cagccaactc gctgtttagc 480 

catcgcttac acgaccgtct gggtgagccg caatatacag attcactgca ttctcaggat 540 

tcagcaagca gtatgtggat gcgtcatgtc ggggggcacg aacgttccag tgccggagac 600

ggccagctaa atactcaggc taaccgctat gtattgcagc taggcggcga tttggcgcag 660 

tggagtagca acgcgcagga tcgctggcat cttggcgtga tggcaggcta cgccaatcag 720 
cacagtaata ctcagagtaa tcgtgtgggt tataaatcgg atgggcgcat cagcggttac 780
agcgctgggc tgtacgcgac ctggtatcag aacgatgcga ataagaccgg cgcttatgtt 840
gacagctggg cgctgtataa ctggtttgat aacagcgtca gttccgataa ccgttctgct 900
gacgactatg attctcgcgg tgtgacggcc tctgttgagg gtgggtatac ctttgaagcg 960
ggaacatgta gcggcagcga agggacgctg aatacctggt acgtccagcc acaggcgcaa 1020
atcacctgga tgggtgtgaa agattctgac catgcccgga aagacggaac gcgcattgaa 1080
acggaaggcg acggaaacgt gcaaacgcga cttggggtga aaacctacct gaatagccat 1140
caccagcgtg acgatggtaa acagcgtgag ttccagcctt acattgaagc gaactggatc 1200
aacaatagca aagtctacgc cgtgaagatg aatggtcaaa ccgtaagccg tgatggtgcg 1260
cgaaatctcg gtgaagtacg taccggggtt gaggcgaaag taaataacaa ccttagcctg 1320
tgggggaatg tcggtgtgca actaggtgat aaaggctata gcgatactca gggcatgctg 1380
ggagtgaaat atagctggta a 1401
<212> Type : DNA
<211> Length : 1401

SequenceName : SEQ ID 406

SequenceDescription :

Sequence

<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString :

atgtcatatc tgaatttaag actttaccag cgaaacacac aatgcttgca tattcgtaag 60
catcgtttgg ctggtttttt tgtccggctc tttgtcgcct gtgcttttgc cgtacaggca 120
cctttgtcat ctgccgaact ctattttaat ccgcgctttt tagcggatga tccccaggct 180
gtggccgatt tatcgcgttt tgaaaatggg caagaattac cgccagggac gtatcgcgtc 240
gatatctatt tgaataatgg ttatatggca acgcgtgatg tcacatttaa tacgggcgac 300
agtgaacaag ggattgttcc ctgcctgaca cgcgcgcaac tcgccagtat ggggctgaat 360
acggcttctg tcgccggtat gaatctgctg gcggatgatg cctgtgtgcc attaaccaca 420
atggtccagg acgctactgc gcatttagat gttggtcagc agcgactgaa cctgacgatc 480
cctcaggcat ttatgagtaa tcgcgcgcgt ggttatattc ctcctgagtt atgggatccc 540
ggtattaatg ccggattgct caattataat ttcagcggaa atagtgtaca gaatcggatt 600
gggggtaaca gccattatgc atatttaaac ctacagagtg ggttaaatat tggtgcgtgg 660
cgtttacgcg acaataccac ctggagttat aacagtagcg acagatcatc aggtagcaaa 720
aafcaaatggc agcatatcaa tacctggctt gagcgagaca taataccgtt acgttcccgg 780
ctgacgctgg gtgatggtta tacfccagggt gatattttcg atggtattaa ctttcgcggc 840
gcacaattgg cctcagatga caatatgtta cccgatagcc aaagaggatt tgccccggtg 900
atccacggta ttgctcgtgg tactgcacag gtcactatta aacaaaatgg gtatgacatt 960
tataatagta cggtgccgcc ggggcctttt accatcaacg atatctatgc cgcaggtaat 1020
agtggtgact tgcaggtaac gattaaagag gctgacggca gcacgcagat ttttaccgta 1080
ccctattcgt cagtcccgct tttgcaacgt gaagggcata ctcgttattc cattacggca 1140
ggagaatacc gtagtggaaa tgcgcaacag gaaaaacccc gctttttcca aagtacatta 1200
ctccacggcc ttccagctgg ctggacaata tatggtggaa cgcaactggc agatcgttat 1260
cgtgctttta attttggtat cgggaaaaat atgggggcac tgggcgctct gtctgtggat 1320
atgactcagg ctaattccac acttcccgat gacagtcagc atgacggaca atcggtgcgt 1380
tttctctata acaaatcgct caatgagtca ggcacgaata ttcagttagt gggttaccgt 1440
tattcgacca gcggatattt taatttcgct gatacaacat acagtcgaat gaatggctac 1500
aacatcgaaa cacaggacgg agttattcag gttaagccga aattcaccga ctattacaac 1560
ctcgcttata acaaacgcgg gaaattacaa ctcaccgtta ctcagcaact cgggcgctca 1620
tcaacactgt atttgagtgg tagccatcaa acttattggg gaacgagtaa tgtcgatgag 1680
caattccagg ctggattaaa tactgcgttc gaagatatca actggacgct cagctatagc 1740
ctgacgaaaa acgcctggca aaaaggacgt gatcagatgt tagcgcgtaa cgtcaatatt 1800
cctttcagcc actggctgcg ttctgacagt aaatctcagt ggcgacatgc cagtgccagc 1860
tacagcatgt cacacgatct caacggtcgg atgaccaatc tggctggtgt atacggtacg 1920
ttgctggaag acaacaacct cagctatagc gtgcaaaccg gctatgccgg gggaggcgat 1980
ggtaatagcg gaagcacagg ctacgccacg ctgaattatc gcggtggtta cggcaatgcc 2040
aatatcggtt acagccatag cgatgatatt aagcagctct attacggagt cagcggtggg 2100
gtactggctc atgccaatgg cgtaacgctg gggcagccgt taaacgatac ggtggtgctt 2160
gttaaagcgc ctggcgcaaa agatgcaaaa gtcgaaaacc agacgggggt gcgtaccgac 2220
tggcgcggtt atgccgtgct gccttatgcc actgaatatc gggaaaatag agtggcgctg 2280
gataccaata ccctggctga taacgtcgat ttagataacg cggtcgctaa cgttgttccc 2340
actcgtgggg cgatcgtgcg agcagagttt aaagcgcgcg ttgggataaa actgctcatg 2400
acgctaaccc acaataataa gccgctgccg tttggggcga tggtgacatc agagagtagc 2460
cagagtagcg gcattgttgc ggataatggt caggtttacc tcagcggaat gcctctagcg 2520
ggaaaagttc aggtgaaatg gggagaagag gaaaatgctc attgtgtcgc caattatcaa 2580
ctgccaccag agagtcagca gcagttatta acccagctat cagctgaatg tcgttaa 2637

<212> Type : DNA
<211> Length : 2637

SequenceName : SEQ ID 407
SequenceDescription :

Sequence

<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString :

atgcagataa tctttggaga aaaatgcgtg tcattactac gactafctttt tgccgccgtc 60 

ttaatgctat ggtgcgctca aaccgctgct tatagcgggc agtgtcatac cactcagggg 120

aatccgtata ttggcgtcaa ttttggcgtt aaaaccctgg aggaagaaga aaatacgact 180

ggggtagtaa aagacaaatt ttatcagtgg aacgaatcga atgattatta tgtttcctgt 240

gattgcgata aagacaatgt cagaagtggc cgatgggcat tcgccgcgga ttcaccgtta 300

gtctatttag gcgacaactg gtacaaaatt aatgactatc ttcrccgccaa agttttattg 360

caggttaaag gcagttctcc tacagcggtt cctttcgaaa acgtggggac tggggcagat 420

acccggtggc atatttgtga ccccggcggt caacgtttag gcggccaggg agctagcggt 480 

aatagcggta gcttttccct gaaaatattg cagccgttcg ttggttcggt cgtcattcct 540 

cctatggcgc tggcgcgatt atttgaatgc tacaacatac ccgcaggtga ttcctgcacg 600 

actacaggca caccggtttt agtgtattac ctgtctggta ctatcaattc acttggctca 660 

tgttccgtca atgccggaga aacaatcgag gtcgatctgg gcgacgtatt tgcggctaac 720

tttcgtgttg tagggcataa gcctcttggg gccagaacgg cagaacttgc aattccagtc 780

aggtgtaaca cgggaaacgc ggggttagtt aacgtcaacc tgagtctgac ggcaaccaca 840 

gaccccagct atccccaggc gattaagacg tcacgtcctg gcgtgggcgt ggfcggtgacc 900

gatagccaga acaacattat ttcccctgct ggtggaacat taccgctctc tattcctgat 960 

gatgcagaca gtatcgcgtg a 981
<212> Type : DNA 
<211> Length : 981 

SequenceName : SEQ ID 408 
SequenceDescription : 

Sequence

<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString :

atgaaaatta aaactctggc aatcgttgtt ctgtcggctc tgtccctcag ttctacagcg 60
gctctggccg ctgccacgac ggttaatggt gggaccgttc actttaaagg ggaagttgtt 120
aacgccgctt gcgcagttga tgcaggctct gttgatcaaa ccgttcagtt aggacaggtt 180
cgtaccgcat cgctggcaca ggacggagca accagttctg ctgtcggttt taacattcag 240
ctgaatgatt gcgataccaa tgttgcatct aaagccgctg ttgccttttt aggtacggtg 300
attgatgcgg gtcataccaa cgttctggct ctgcagagtt cagctgcggg tagcgcaaca 360
aacgttggtg tgcagatcct ggacagaacg ggtgctgcgc tgacgctgga tggtgcgaca 420
ttcagtgagc aaacaaccct gaataacggt actaacacca ttccgttcca ggcgcgttat 480
tatgcaatcg gcgaggcaac cccgggtgct gctaatgcgg atgcgacctt caaggttcag 540
tatcaataa 549

<212> Type : DNA

<211> Length : 549

SequenceName : SEQ ID 409
SequenceDescription :

Sequence

<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString :

atgaaattaa aagtcatcgc tacactgatt gctactgttg ccgtgggtgt aagctttaac 60 

agcaattttg cttctgcgag tacaacgtcc gcttctttaa ccgtaaacag taacctgact 120

atgggtacct gcagtgctca gataatggat aatagtaata aagtgatcaa tgaagtggtc 180 

tttggcaatg tttatatttc tgaactcggt gcaaaaagca aagtgcaaca gtttaaaatt 240 

cgctttagca attgctctgg ccttccccaa aacagcgccc aaatagtgct ggcacctaat 300

ggtatatcct gtgctggttc tcaatcgtca tcggcgggtt tttctaacaa gtttactgac 360

gctagcgcag caaccagaac ggctgtggaa gtatggacta cagatacacc ggaaagcaat 420

ggcagtacgc aattccattg tgctcaaaag ataccagtgc ctgtgacgct tcccgccgac 480 

accacaactc agccttacga ttacccgtta agtgcacgga tgaccgttgc ggaaggtaga 540 

ttggtaaccg atgtaagacc gggtaatttc cgctctccca cgactttcac gatcacttat 600 

cagtaa 606 
<212> Type : DNA

<211> Length : 606

SequenceName : SEQ ID 410
SequenceDescription :

Sequence

<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString :

ttggcatcaa cagttgagta tggtgagaca gttgatggtg ttgtcctgga aaaagatatc 60 

cagctggttt atgggaccgc caataatacg aaaatcaatc ctggcggaga acagcatatt 120 

aaagaatttg gtataagtag taatactgaa attaacggcg ggtatcagta cattgaaatg 180

aatggcaccg cagaatactc agtattaaat gatggttatc aaattgttca aatgggtggc 240

gcggcaaacc agactacgct caataatggt gtgctacagg tttatggcgc agcgaatgat 300

cccacgatta aaggcgggcg cttaatcgtt gaaaaagatg ggattaccgt ccttgccgct 360 

atcgaaaagg gaggattact ggaggttaaa gaggggggafc tagcgattgc ggtagatcag 420 

aaagcaggcg gtgctattaa agcaagcacg cgggtcatgg aggtattcgg aacaaaccgt 480 

ctcggtcagt tcgaaatcaa gaatggtatt gctaacaata tgctgttgga aaacggcgga 540
agtttgcgag ttgaagaaaa tgacttcgct tataatacta ctgtagatag tggcggctta 600 
ctggaggtta tggatggcgg gactgcaact ggcgttgata aaaaagcagg cggaaaatta 660 
attgtctcaa cgaatgcgct ggaagtgagt ggtacaaaca gtaaaggcca atttagtata 720

aaagatggtg tgtcaaaaaa ttatgaactg gatgatggtt ccgggcttat tgttatggag 780 

gacacgcagg ccattgacac tatcctcgat gagcatgcca ctatgcaatc gctgggaaag 840
gatactggta cgagagtgca ggcaaatgcg gtatatgatc tcggtcgatc agatcagaat 900 
ggaagtataa cgtattcctc taaagccatc tctgaaaata tggttatcaa caatggccgc 960 

gctaacgtct gggctggcac aatggttaac gtgtcagtca gaggaaatga tggcattctt 1020

gaggttatga agccgcaaat aaattatgca cccgcaatgt tggtgggtaa ggtagtggtt 1080 

tctgagggcg cttctttaag aacgcatggt gccgtggata ccagcaaagc ggatgtttcg 1140

gtcgaaaata gcgcatggac catcattgcc gatatcacta cgacgaacca aaacacccgc 1200

cttaacttag ccaaccttgc gatgtctggc gcaaatgtga ttatgatgga tgagtcagtg 1260

actcgttcat ctgtgacggc aagtgcggaa aatttcacta cgttgaccac caataccctg 1320

tcgggaaacg gcaatfcttta tatgcgtacc gatatggcga atcatcagag cgatcagctc 1380

aacgtcaccg gtcaggcaac aggtgatttc aaaatattcg tgacggacac cggtgccagc 1440

ccggcagcag gagatagcct tacactggta acaacgggcg gcggtgatgc tgcatttacg 1500 

ttgggcaatg ccggaggcgt tgttgatatc ggtacgtatg aatatacctt gctggataat 1560 

ggtaaccata gctggagtct ggcagagaat cgcgcgcaaa ttaccccttc aaccactgat 1620 

gtgctgaata tggcggccgc acaaccgctg gtatfctgatg cagaactgga caccgtgcgt 1680 

gagcgtcttg gtagcgtaaa aggcgttagt tacgatacgg cgatgtggag ttcggcaatt 1740

aacacccgca acaacgtgac cactgatgcg ggagctggtt ttgagcaaac attgacgggc 1800 

ctgacgctcg gtatcgatag ccgtttctcc cgtgaagaaa gcagcacaat tcgcggcttg 1860

tfcctttggtt actctcatfc tgatattggt tttgatcgcg gcggcaaagg caatgtcgat 1920 

agctataccc tgggggctta tgccggttgg gagcatcaga acggtgccfca tgttgatgga 1980 

gtggtgaaag ttgaccgttt tgccaacacc atccatggca agatgagtaa tggggcaaca 2040

gcgtttggcg attacaatag taacggcgcg ggtgctcatg tcgagagcgg gttccgttgg 2100 

gttgacggat tgtggagtgt tagaccctat ctggccttta ccggctttac cacagatggt 2160 

caggactaca cgttatcaaa cggcatgcgc gcggatgtgg gaaatacccg gatattacgc 2220 

gctgaagcgg gaacggcggt aagctatcac atggacctgc aaaacggtac gacgctggaa 2280

ccctggctga aagccgccgt gcgtcaggaa tacgccgatt ctaaccaggt gaaagttaat 2340

gacgatggca aatttaataa tgatgtggct ggaacccgtg gcgtttatca ggctgggata 2400 

aggtcatcgt ttaccccgac gttaagcggt catttgtcag tcagctatgg caatggcgca 2460 

ggggtagaat cgccgtggaa tacccaggcg ggtgtggtct ggacgttctg a 2511 

<212> Type : DNA

<211> Length : 2511

SequenceName : SEQ ID 411
SequenceDescription :

Sequence

<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString :

atgcaaagga aaggcaataa actgttgatt cagttatgca gtgtgatact gctatttttt 60
accacatcct ggtatgcatt ggcgaatgaa tgttatatag agagaaatgc tgaaggggat 120
tatcacatga agataagctc tactcagctt agtctggcgt cacaaatggt cgaggttccg 180
acagaaatag ccgaagctac atgggatgta aatattcaac taagaggcga tgccataggg 240
tgtaaatctc ttggggatag taaggcagtt cactttctta atacagctga cccaagttta 300
atatccacgt acaccacaac gaatggcgca gcgttattaa aaacaactgt tccaggcatt 360
gtgtattctg tcgagttatt atgccttagt tgtggtgccg cagatgaact tgatttatgg 420
ctacctgcac aaagtggcgc agataacttc ataccaagca cccagacgaa atgggcctat 480
gagtacagtg atcaaagttg gtatttacgt tttcgcttat tcataactcc tgaatttaaa 540
cccaagaatg gtgtttccag cggaacaacg atagcaggaa agattgcgtc atggtatata 600

ggtaccaatg accagccgtg gatcaacttt tacattgaca atgactcttt aaagtttttc 660

gtcgatgaac cgacctgtgc aacagttgcc ctggcacaag atcagggcaa cgtcagtggc 720

aatcaggtaa cgcttgggaa cagctatgtt tcggaagtga aaaatgggct tacgcgggaa 780

atcccttttt ctatccgtgc tgaatactgt tatgccagta aaattacggt taagttgaaa 840

gcggcaaata aacccagcga tgccacactg gtgggtaaaa cgactggctc ggcttcaggc 900

gtggctgtaa aagtaaattc aacttatgac aatagcaaag tattgttaaa agcagatggt 960

agcaacacgg ttgactacaa cttcgccgcc tggtcaaaca acctgctgtt tttacctttt 1020 

acggcgcagc tggtaccgga tggtagcggt aatgctgtcg gtgttggaac attttcaggt 1080 

aacgcgacct tctcctttac ctacgaataa 1110
<212> Type : DNA
<211> Length : 1110

SequenceName : SEQ ID 412
SequenceDescription :

Sequence

<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString :

ttgtaccagt ttactcatca aaaaagccgt atcccgaaaa aaacgctact tgcggcctgt 60

tgtgccctgt tttatagcag caacggtgct gcggcggaca ccgtggaata tgacagttcc 120 

tttttaatgg gaactggcgc atcaacgatt gatgttaaac gttatgctca aggcaacccg 180

acaccgccgg gtctctataa tgtccgcgta tttgtaaacg gtcaggcgac ttccagctta 240

gaaattccgt ttgtggatat tggcgaaaac agtgcggcgg cctgtcttac ccataaaaac 300

ctggcgcaac ttcacattaa gcaacctgaa cagcctgtca ctttactcgc cagagaaggt 360

gaagaagagg attgtctgga tctggcaaag tcatacgaaa aggcggatgt gtgctttgac 420 

ggtagtgacc agtttctcga tctgacgatc cctcaggcct atgttctgaa aagctatggc 480 

ggctacgttg acccttcttt atgggaatcg ggaattaacg ctgccacact ggcatatacc 540 

ctgaacgcgt atcacacaag ttcagataac gacaatagtg acagcgtcta tggcgcgttc 600

aactcaggta tcaatttagg agcctggcac tttcgtgcgc gcggtaacta taactggaca 660

acagataacg gcagcgattt cgatttccag gatcgttact tacagcgtga cattccggca 720

atccgttccc agataattat gggtgatgcc tataccaccg gtgaaacgtt tgactctgtc 780 

aacgtccgtg gtgttcgcct gtacagcgac agccgtatgc tgccttcggc gctggccagt 840

tacgctccga ccatccgcgg tgtagcaaac tccaacgcca aagtcaccgt gacgcaaagc 900 

ggatataaaa tttatgaaac caccgttccg cccggtgaat ttgttataga cgacattagc 960

ccttccggct ttggtagcga actggtcgtg accattgaag aagcggatgg ttccaaacgc 1020 

acctttacgc aacccttctc gtcggttgta caaatgcaac gtcctggtgt gggccgttgg 1080

gatttcagcg cgggtaaagt cattgatgac agtctgcgat ccgaacccaa tatggggcaa 1140 

gcctcttatt actatggtct gaataacctc ttcacgggtt ataccggcat tcagttcacc 1200

gataataact atcttgccgg gctgttaggt gtgggtatca acaccagcat cggcgccttt 1260

gcggtagacg ttacccattc ccgtgctgaa attccggatg ataaaaccta ccaggggcaa 1320 

agttatcgcg tgacctggaa caaacttttc caggataccg ggacatcatt taacctcgcg 1380

gcgtaccgct attccaccca ggattacctg ggcctgcatg atgcgttagt cctcattgac 1440 

gacgccaagc atttgtctgc cgatgaagac aaaaacacca tgcagacgta ctcacgtatg 1500 

aaaaaccagt ttaccgtcag cattaaccag ccattgaata tcgcctatga agattacggt 1560

tcgctgttta tttccggtag ctggacgtat tactgggcgg cgaacaatag ccgcactgaa 1620 

tataatgttg gttacagtaa aagcgtttcg tggggcagtt tcagcgtcaa cctacaacgt 1680 

agctggaatg aagacggcga gaaagatgac gcgatgtacg tcagcgttag cgtacctatt 1740 

gagaatattt taggtggcaa acgtaagtct tctggtttcc gcaatttaaa tactcagctc 1800 

aataccgatt tcgatggttc acatcagttg aatgttaaca gttccggtaa cactgaaaac 1860

aatctggtga actacagtgt caacgcaggt tatagcctcg ataaaaacgc cggcgattta 1920 

gcctctgttg gtggttatct caactatgaa tctgggttag gcggtatttc cgcttcggcc 1980 

tcggccactt ctgataacag ccaacagtac tccatctcaa ccgatggcgg ctttgtatta 2040

cacagtggtg gtttaacgtt cactaacaac agtttcagca gtaacgacac gctggtgtta 2100 

atcaacgccc taggtgctaa aggcgcacga atcaataaca gtaataacga aatcgatcgc 2160

tggggatatg ccgtgacgtc ctctgtcagc ccatatcgtg aaaaccgggt aggtctgaac 2220 

attgaaacac tggaaaacga tgttgaactg aaaagtacca gcgccaccac cgtaccacgt 2280

agcggctccg ttgttttgac ccgtttcgaa actgacgagg ggcgttctgc cgtgctgaat 2340 

attactgccg ccaatggcaa atccattccg tttgctgcgg aggtttacca gggtgaggtg 2400 

atgatcggca gcatgggcca gggtggtcag gcatttgtac gcggtattaa cgacagcggg 2460

gaattaatcg tgcgctggta tgaaaacaac caaaccattg actgtaagtt gcactaccag 2520

ttcccggcgc agccacaaac gcagggaagc accaacacct tattacttaa caatcttacc 2580 

tgtcaggtag caaatcacta a 2601 
<212> Type : DNA 

<211> Length : 2601

SequenceName : SEQ ID 413 
SequenceDescription : 



Sequence 



<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString :

atgaagttca aacgattgct gcatagcggc atcgccagtt tgagtctggt tgcctgcggg 60 

gtgaatgcgg cgacggatct tggcccggca ggggatattc atttctccat cactatcacc 120 

actaaagctt gcgagatgga aaaaagcgat ctcgaagtcg atatgggaac aatgacgctg 180 

caaaaacctg cggcagtcgg tacggtgttg agcaagaaag atttcaccat tgaactcaaa 240 

gagtgcgatg ggatatccaa agcgaccgtt gagatggaca gtcagtcgga cagcgatgat 300

gattccatgt ttgcccttga ggctggtggc gcaacgggtg ttgcgttgaa gatagaggac 360 

gataaaggaa cgcagcaagt tcccaaaggc tccagcggaa cgccgattga atgggcgatt 420 

gatggcgaaa ccacgtcgct tcactaccag gcgagttatg tggtcgtcaa cactcaggcc 480 

actggtggca cagcgaatgc ccttgtaaat ttttccatca cctatgagta a 531 



<212> Type : DNA

<211> Length : 531

SequenceName : SEQ ID 414
SequenceDescription :

Sequence

<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString :

atgaaataca ataacattat tttcctcggt ttatgtctgg ggttaaccac ctattctgct 60

ttatccgcag atagcgttat taaaattagc gggcgcgtcc tcgattatgg ctgcacagtc 120 

tcatcggatt cgcttaattt taccgtagat ctccaaaaaa acagtgccag acaatttcca 180 

acgaccggta gcacaagtcc agccgtccct tttcagatta cgttaagtga atgcagcaaa 240 

gggacaacgg gggttcgggt tgcatttaac ggtattgagg acgcagaaaa taatactctg 300

ttgaaactgg atgagggaag caatacggcc tccggtttag gtatagaaat actggacgga 360

aatatgcgtc cggtgaaact gaatgacctt catgccggga tgcagtggat cccactggta 420 

ccagaacaga acaatatttt gccttactcc gctcgtctga agtcaactca gaagtccgtc 480 

aatccgggac tggtgagggc ttcggcaacc tttacccttg aatttcaata a 531 



<212> Type : DNA

<211> Length : 531

SequenceName : SEQ ID 415
SequenceDescription :

Sequence

<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString :

atgaaatggc gcaaacgtgg gtatttattg gcggcaatat tggcgctcgc aagtgcgacg 60 
atacaggcag ccgatgtcac catcacggtg aacggtaagg tcgtcgccaa accgtgcaca 120
gtttccacca ccaatgccac ggttgatctc ggcgatcttt attctttcag tctgatgtct 180 
gccggggcgg catcggcctg gcatgatgtt gcgcttgagt tgactaattg tccggtggga 240 
acgtcaaggg tcactgccag cttcagcggg gcagccgaca gtaccggata ttataaaaac 300

caggggaccg cgcaaaacat ccagttagag ctacaggatg acagtggcaa cacattgaat 360 
actggcgcaa ccaaaacagt tcaggtggat gattcctcac aatcagcgca cttcccgtta 420
caggtcagag cattgacggt aaatggcgga gccactcagg gaaccattca ggcagtgatt 480 
agcatcacct atacctacag ctga 504 
<212> Type : DNA 
<211> Length : 504 
SequenceName : SEQ ID 416

SequenceDescription :

Sequence

<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString :

atgaaaagag cgcctcttat aacaggactt ttgttgatat ccacatcctg cgcttatgcc 60 

tcctcagaag ggtgtggagc tgacagcact agcggtgcga caaattacag cagtgtggtt 120 

gatgatgtta cggtgaacca gacagataac gtgacaggac gggagtttac ctctgcaacg 180 

ctaagtagca ctaactggca atacgcctgt tcctgctctg cgggtaaggc agttaaactt 240

gtctatatgg tcagccccgt acttaccacc actggacatc agacaggata ttacaaactc 300

aatgacagcc tggatattaa aaccatgaac cgccccggaa atcctggaga ctaa 354 
<212> Type : DNA

<211> Length : 354

SequenceName : SEQ ID 417
SequenceDescription :

Sequence

<213> OrganismName : Escherichia coli O157:H7

<400> PreSequenceString :

atgaaaaaag cacttctcgc agccgctctg gttatggctt ctggttccgc cctggctgta 60 

gatggtggtc atatcgactt taacggtatg gtacagtccg gtacctgtaa agtgggtgtg 120 

gtagatactg gtatgcatag cgttaccact gatggcgtgg ttaccctgga tactgcgaat 180 

gttactgata cttttgctga agttagcgca actgctgtcg gtttactgcc gaaagagttc 240

atgatttctg ttgagtgtga tccaggtgct ccgaagaatg ctgagttaac tatgggttct 300

gcaagttacg cgaacaccag cggtaccctg aataacaata tgaacatcac tgttaacggt 360

attgcaccgg ctcagaacgt aaacattgca gttcataaca tgaaaaacaa agctggcgct 420

gctgaaatta agcaggtcca tatgaacaac tcttctgaag ttcaggaact gacattagac 480

gcagaaggta aaggccagta cgtafcttaac gcatcttacg ttaaagcacc gaacagcccg 540 

gctgtaactg ctggtcatgt aaccactaac gcgctgtaca ccgttgctta taagtaa 597

<212> Type : DNA 
<211> Length : 597 

SequenceName : SEQ ID 418 
SequenceDescription :

Sequence

<213> OrganismName : Escherichia coli O157:H7

<400> PreSequenceString :

atgaaaccaa atatgattgt aggagcatta gcgttaactt ctgtgtttat ggcaggtcac 60 
ctacaggcgg ctgatggaac agtccatttc cgtggtgaaa ttattgacag tacttgcgaa 120

gtcactcctg aaactaaaga tcaggtcgtt gatttaggca aagtaaaccg tacagccttt 180 
agtggcgtcg atgatgtggc tgccccgacg gctttttcta tcgatctgac tcaatgcccg 240 

gaaaccttta agtccgccgc aattcgtttc gatggtaatg aagatgctca tggtaatggc 300

aacctggcaa ttggtacccc gctggataac tctaacgatg ctgccgctgg tattagcccg 360

agtgataaca gtggggatta tactggtgcg ggtgccgtta gtgcagcgaa aggcgtagct 420 
attcgtttat ataaccgtgc agataacact caggtcaagt tatatgaaaa ttctgcatca 480 
actccgattt ctaatggtaa tgcatccatg aagttcatgg ctcgttatat tgctacggaa 540 

acgactattg accctggtac agctaacgcc gactcgcagt ttacagttga atatataaaa 600
taa 603 
<212> Type : DNA 
<211> Length : 603 

SequenceName : SEQ ID 419 

SequenceDescription :

Sequence

<213> OrganismName : Escherichia coli O157:H7

<400> PreSequenceString :

gtgccaattt tccagcgtga aggccatctc aaatatagct ttgccgcagg tgaatatcag 60 

gccgggaatt atgacagcgc ctcgccgcgt ttcgggcagc ttgatctgat ctacggttta 120 

ccgtggggga tgacggccta cggcggcgta ttaatctcta ataattacaa tgcatttaca 180 

ttagggatag ggaaaaactt tggttatatc ggggcgattt ccattgatgt gacgcaggct 240 

aaaagcgaac tgaataacga tcgcgatagc cagggacaat cttatcgttt cttatattcc 300

aagagcttcg aaagcggcac cgatttccgc cttgcgggct atcggtactc taccagcggt 360

ttctatacct tccaggaagc caccgatgtg cgcagtgacg ctgacagcga ctataaccgt 420

tatcacaagc gcagcgaaat acagggtaac ctgacgcagc aattaggggc ctatggctct 480

gtttatttaa atttaacgca gcaggattac tggaacgacg caggtaaaca gaacacggta 540 

tcggcgggtt acaacggacg tattggcaag gtcagttaca gtattgcata tagctggaat 600

aaaagccctg aatgggatga aagcgatcgc ttgtggtctt tcaatatttc cgttccacta 660 

ggccgggcct ggagtaacta tcgcgtcacg accgaccagg atggtcgtac caatcaacag 720

gttggggtca gcggaacgct gcttgaggat cgcaacctga gctacagtgt ccaggaaggc 780 

tacgccagca acggtgtggg taacagcggt aacgctaacg ttggctatca gggtgggtcc 840 

ggtaatgtca acgtaggcta tagctacggg aaagattacc ggcagctcaa ctacagcgtt 900

cgcggcggcg tgatagttca tagcgaaggc gtgacgcttt cccaaccgct aggcgaaacc 960 

atgacgctca tctccgtacc cggtgcgcgc aatgcccgcg tggtgaataa cggcggcgtt 1020 
caggttgact ggatgggtaa cgcgatcgtg ccttatgcca tgccgtatcg tgaaaacgaa 1080

atctcactgc gtagcgattc gttgggtgac gatgttgacg ttgaaaatgc gttccagaaa 1140 

gtggtgccaa cgcgtggagc gattgtcaga gcgcgttttg atacccgcgt tggttaccgc 1200 

gtattaatga cgctgcttcg ttccgcgggc agcccggtgc cctttggagc aacggcaacg 1260

ctaatcaccg ataaacaaaa cgaggtgagc agtatcgttg gtgaagaagg acagctctat 1320

attagcggaa tgccagagga aggacgggta ttgattaaat ggggtaatga cgcgtcgcag 1380

caatgcgtgg cgccttataa attatccctg gaattaaaac agggcggaat tattcctgtt 1440 

tcggccaatt gccagtaa 1458 
<212> Type : DNA 
<211> Length : 1458

SequenceName : SEQ ID 420

SequenceDescription :

Sequence

<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString :

atgagtggtt acaccgtcaa gcctcctacc ggagacagca atgagcagac acaatttatt 60
gattatttta atctgttcta cagtaagcgt gatcaggaac aaataagcat ctctcagcag 120
cttggaaatt acggtgcgac atttttcagt gccagtcgcc aaagttactg gaacacgtca 180
cgcagcgacc agcaaatatc atttggatta aatgtgccgt ttggtgatat tacgacttcg 240
ctgaattaca gctattccaa taatatatgg caaaacgatc gggatcattt actcgctttt 300
acgcttaatg ttcccttcag tcattggatg cgtacagaca gtcagtcggc atttcgtaat 360
tcaaacgcca gttacagtat gtcaaacgat ttgaaaggcg gcatgaccaa tctatcgggg 420
gtttatggca ctctgctgcc ggataataac ctgaattata gcgttcaggt cggtaacacc 480
cacggaggta atacatcgtc tggcaccagt ggttacagta ctcttaatta tcgtggagct 540
tacggcaata ctaatgtcgg ttacagtcgg agtggtgaca gcagccagat ttattacgga 600
atgagtggtg ggattattgc tcatgctgat ggcatcacct ttggacagcc gctgggcgac 660
acaatggttc tggttaaggc tcctggcgct gataatgtca aaatagagaa ccagaccgga 720
attcataccg actggcgtgg ctatgccata ttaccatttg cgacagaata tagagaaaat 780
cgtgtcgctc ttaacgcgaa ttcccttgca gataatgttg aactggatga aaccgtggtc 840
actgtcatcc caactcacgg tgctattgcc agagcaacat ttaatgcaca aatcggcggg 900
aaagtattaa tgacgttgaa gtacggtaat aaaagcgttc cattcggtgc aattgtcact 960
cacggagaga ataaaaatgg cagcattgtc gcggaaaacg gtcaggttta tctgactgga 1020
cttccacagt cagggaaatt acaggtttca tggggcaatg ataaaaactc aaactgtatt 1080
gtcgattaca agcttcctga agtctctcct ggaaccttgc tgaaccagca gacagcaatc 1140
tgtcgctaa 1149
<212> Type : DNA
<211> Length : 1149

SequenceName : SEQ ID 421

SequenceDescription :

Sequence



<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString :

atgtctgctt tgtatgaacg ttcacagctg acgcaggtga tgatttcatc tgccccggcg 60
actgctgaaa ccatggagaa ggcggaatat ctgcgcctgg actgcaccat caaggaagtc 120
cagttcaccg ccggtcagaa acaggatatt gatgtgacca cgctctgctc cacagagcag 180
gagaacatca acggtctggg ggcgtcgtcc gagatttcca tgtcgggtaa tttttatctg 240
aatcaggccc agaacgccct gcgtgatgcc tatgacaatg acacggtgta tgcgtttaag 300
gtgcagtttc cgtccggtaa gggctttaag ttcctggcgg aagtgcgtca gcacacctgg 360
tcatccggta ccaacggcgt ggtggctgca acgttttcac ttcgcctgaa gggtaaaccg 420
gtgtcctatg tggtaccgct ggcgtttgtg aaaaatctgg ataagacact taccgtgaat 480
accggtgcgc tgctgacaat gtcagtcagt gtcaacgggg gaacgccgcc ttataaacac 540
gcctggaaga aggatggtca gccggtagag ggacagacta ctgacacttt cagtaagcca 600
ggtgcgcagt caggtgataa gggggcttat acctgcgagg taacggattc tgcagaacag 660
ccgcagagca ttacctctga tgcgtgtaca gtaacggtta atggtgcggg cggataa 717

<212> Type : DNA

<211> Length : 717

SequenceName : SEQ ID 422
SequenceDescription :

Sequence

<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString :

atgagaaaca aaccttttta tcttctgtgc gcttttttgt ggctggcagt gagtcacgct 60 

ttggctgcgg atagcacgat tactatccgc ggctatgtca gagataacgg ctgtagtgtg 120 

gccgctgaat caaccaattt tactgttgat ctgatggaaa acgcggcgaa gcaatttaac 180 

aacattggcg cgacgactcc tgtcgttcca tttcgtattt tgctgtcacc ctgtggtaac 240 

gccgtttctg ccgtaaaagt tgggtttacc ggcgttgcag atagccacaa tgccaacctg 300 

cttgcacttg aaaatacggt gtcagcggct tcgggactgg gaatacagct tctgaatgag 360 

cagcaaaatc agatacccct taatgctcca tcgtccgcga tttcgtggac gaccctgacg 420

ccgggtaaac caaatacgtt gaatttttac gcccggctaa tggcgacaca ggtgcctgtc 480

actgcggggc atatcaatgc cacggctacc ttcactcttg aatatcagta a 531 



<212> Type : DNA

<211> Length : 531

SequenceName : SEQ ID 423
SequenceDescription :

Sequence

<213> OrganismName : Escherichia coli O157:H7

<400> PreSequenceString :

atgaataaat ccgttgtgtc aatttctgcg gcaatgttgg ttttactttg ccaaccggtc 60
atggggagcg aaatctcacc cgcaacaccg tcagatgaag acaactacac ctttgacccg 120
caactcttcc gcggcagcag atttagtcag tcgtcattag caaaactgac aacacgtgag 180
tctgttgcac cgggcaatta taaaatggat atctacacca acaataagtt gtcaggcagt 240
tggaatgtca cgtttaaaga agccgctgat ggtcgcgttc tgccctgcct gacgcctgaa 300
gtcgcggacg cgataggcct caaaacaggg gaagataagg gggaaaaaga tcctgtctgt 360
acgtttgcta aggaactcgc tcccggcatc accagccaga cacagttgtc acaattgcgc 420
ctggacttat cggtgccaca gagtcaattg attagtcgcc ctcgcggcta tgttcccccc 480
agcgagctgg ataccggagc atcgctggcg ttcatgaatt atattgccaa ctattacaac 540
gttgcctatt cagggcagaa tgctcatagc cagcgttcgc tatgggcatc atttaatggt 600
ggcatcaacc ttggtgcctg gcaatatcgt cagttatcca acatgacctg ggataatgac 660
aaagggaatc agtggaacaa tattcgtagc tatttgcaac gcccgctgcg cgccataaat 720
agccagttaa tgatggggca gcttatcacc agcggaagat ttttctctgg actcagttat 780
cacggcgtta gtctcgcgac cgatgaacgt atgctgccgg actccatgcg cggctatgcg 840
ccgactattc gcggcgtggc cgcaacaaac gccagagtct cggtaatgca aaacggtcat 900
gaaatatatc agaccaccgt ggctcctggc cctttcgaga taaacgacct ataccccacc 960
agctacagcg gcgatctgga tgtcaccgtt acggaagcta acggcgcagt cagtcgtttc 1020
agtgtcccct tttcagccgt accagaatcg atgcgtccag gaacttcccg ttataacgtg 1080
gaagtaggta aaacgcagga tagtggtgat gactcgatgt ttggtgacct tacctggcag 1140
cacgggatga ctaatacgct gacatttaac agtggttcgc gtatcgctga tggctaccag 1200
gcgctgatgc tgggcggagt ctatggcagt tcgctggggg catttggggc aaacctcact 1260
tggtcccatg cgcgtgttcc cgaaagcgaa gcgcagagtg gttggatgtc gcaattaacc 1320
tggagtaaaa ctttccagcc tacttcaacc accgtctccc tggcaggtta tcgatactct 1380
accagcggct atcgtgatct ggctgatgtg ctgggagagc gtcatgctgc cagcaataaa 1440
cagtcatggg actccagcca gtggcgtcaa cagtcgcgct tcgatcttac gttaagtcag 1500
agccttgcga attacggcaa cctgtttgtg tcaggttcaa cacagaacta ccgtggcggc 1560
aagagccgtg atacacagct tcagttaggt tacagcaata gctttagcca tggcatcagt 1620
atgaaccttt ccgtcggacg ccaaagaatg ggcggctata aagacaattc tgatgatatg 1680
cagacggtaa catccctttc attctcattc ccacttggcg gcaatggacc tcgtgtacca 1740
agtcttagca acagctggac ccattcaact gacggtagct cgcaattaca aagctcgcta 1800
accggaatgc ttgatgaagc acagaccacc aactacagcc tgaacgtcat gcgcgatcaa 1860
caatataagc agacgacgct tagcggaaac atgcaaaaac gtttttcaca aactaccgtc 1920
ggattgaacg catcgaaggg ccaggattac tggcaggctt caggtaacgt acaaggcgcg 1980
atggctgtgc atggtggcgg cattactttc ggaccttatc tgggtgaaac gttcgccctg 2040
gtcgaagcta aaggcgcaga aggtgcaaaa gtctataact ccagtcagct ggaaattaat 2100
gacagtggct atgcgcttgt tccggcagta acgccctatc gctacaaccg tatatctctc 2160
gatccacaag gaatggatgg cgatgccgag ttggtcgaca gtgaaagaca ggtagcaccg 2220
gttgcgggtg cggcggtgaa agtaattttc cgtacccgtc ctggtaaagc gttgctgatt 2280
aaatcccgca tggcagatgg ttcggaactg ccaatgggag ccgatgtgct ggatgagaat 2340
aatacagtcg tcggtatagc cggtcagggg gggcaaattt acctccgcac agaacagaca 2400
aaaggccact tgtcagttcg ctggggtgaa ggtgctaacg atagctgcca attgcccttt 2460
gatatcagcg ggaaggacag caatagccct atcatccgcc tgaatgaaac ctgtcagtct 2520
tga 2523

<212> Type : DNA
<211> Length : 2523

SequenceName : SEQ ID 424
SequenceDescription :

Sequence

<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString :

atgaaactcg ccgcctgfctt tctgacactc cttcctggct tcgccgttgc cgccagctgg 60 

acttctccgg ggttccctgc ctttagcgaa cagggaacgg gaacatttgt cagccacgcg 120 

cagttgccca aaggtacgcg tccactcacg ctaaattttg accagcagtg ctggcagcct 180

gcagatgcga taaaactcaa tcagatgctt tccctgcaac cttgtagcaa cacgccgcct 240

caatggcgat tgttcaggga cggcaaatat acgctgcaaa tagaoacccg ctccggtacg 300

ccaacattga tgatttccat ccagaacgcc gccgaaccgg tagcaaacct ggtccgtgaa 360

tgcccgaaat gggatggatt accgctcacg ctggatgtca gcgccacttt cccggaagga 420 

gccgccgtac gggattatta cagccagcaa attgcgatag tgaagaacgg tcaaataacg 480 

ttacaacccg ctgctaccag caacggttta ctcctgctgg aacgggcaga aactgacgcc 540

tctgcGcctt tcgactggca taacgccacg gtttactttg tgctgacaga tcgtttcgaa 600
aacggcgatc ccagtaatga ccagagttac ggacgtcata aagacggtat ggcggaaatt 660 
ggcacttttc acggcggcga tttacgcggc ctgaccaaca aactggatta cctccagcag 720

ttgggcgtta atgctttatg gataagcgcc ccatttgagc aaattcacgg ctgggtcggc 780 
ggcggtacaa aaggcgattt cccgcattat gcctaccacg gttattacac acaggactgg 840 

acgaatcttg atgccaatat gggcaacgaa gccgatctac ggacgctggt tgatagcgca 900
catcagcgcg gtattcgtat tctctttgat gtcgtgatga accacaccgg ctatgccacg 960 

ctggcggata tgcaggagta tcagtttggc gcgttatatc tttctggtga cgaagtgaaa 1020 

aaaacgctgg gtgaacgctg gagcgactgg aaacctgccg ccgggcaaac ctggcatagc 1080

tttaacgatt acattaattt cagcgacaaa acaggctggg ataaatggtg gggaaaaaac 1140 

tggatccgta ccgatatcgg cgattacgac aatcctggat tcgacgatct caccatgtcg 1200

ctagcctttt tgccggatat caaaaccgaa tcaactaccg cttctggtct gccggtgttc 1260 

tataaaaaca aaacggatac ccacgctaaa gccatcgacg gctttacccc tcgcgattac 1320 

ttaacccact ggttaagtca gtgggtccgc gactatggga ttgatggttt tcgggtcgat 1380

accgccaaac atgttgagtt gcccgcttgg cagcaactga aaaccgaagc cagcgccgcg 1440 

cttcgcgaat ggaaaaaagc taaccccgac aaagcattag atgacaaacc tttctggatg 1500

accggtgaag cctggggcca cggcgtgatg caaagtgact actatcgcca cggcttcgat 1560 

gcgatgatca atttcgatta tcaggagcag gcggcgaaag ctgtcgattg tattgcgcag 1620 

atggatacga cctggcagca aatggcggag aaattgcagg gtttcaacgt gttgagctac 1680 

ctctcgtcgc atgatacccg tctgttccgt gaagggggcg acaaagcagc agagttatta 1740 

ctattagcgc caggcgcggt acaaatcttt tatggcgatg aatcctcgcg tccgttcggt 1800

cctacaggtt ctgatccgct gcaaggtaca cgttcggata tgaactggca ggatgttagc 1860 

ggtaaatctg ccgccaacgt cgcgcactgg cagaaaatca gccagttccg cgcccgccat 1920

cccgcaattg gcgcgggcaa acaaacgaca ctttcgctga agcagggcta cggctttgtt 1980 

cgtgagcatg gcgacgataa agtgctggtc atctgggctg ggcaacagtg a 2031 


<212> Type : DNA
<211> Length : 2031

SequenceName : SEQ ID 425

SequenceDescription :


Sequence 



<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString :

atgccacaac gacaccacca gggacataaa cgcacaccga aacagttggc gctcattatc 60

aaacgctgtt tgccgatggt gctcactggc agcggcatgc tttgcactac cgctaacgcc 120 

gaagagtatt atttcgaccc cattatgctg gaaaccacaa aaagtggtat gcaaacaacc 180 

gatctgtcac gtttttcaaa aaaatacgca caactaccag gaacttatca ggttgatatc 240 

tggctgaata aaaagaaggt ttcacagaaa aaaattacat ttaccgccaa tgcagagcaa 300

cttctgcagc cacagtttac ggtagaacaa ctacgtgagc tgggtattaa ggtggatgaa 360

atcccggcgc tggctgaaaa agatgacgat agcgtgatca actcgcttga acaaatcatt 420 

cccggtacag ctgctgaatt tgatttcaat catcagcgac ttaatttgag cattccccaa 480 

attgcactgt accgtgatgc aagaggttac gtctcccctt ctcgttggga cgatggtata 540 

ccaacgctgt ttaccaacta ctcgtttaca ggttctgata accgttaccg ccagggcaat 600 

cgtagccaac gacagtacct aaatatgcaa aatggtgcca attttggccc ctggcgatta 660

cgtaactatt ctacgtggac acgcaacgat caggcgtcaa gctggaacac tatcagtagt 720 

tatttacaac gtgatatcaa ggcgttgaag tctcagttgc ttctgggaga aagcgccacc 780 

agcggcagta ttttttccag ctacaacttt actggcgtgc aactcgcttc cgacgataat 840 

atgttgccaa acagccagcg cggatttgcc ccaacggtac gcggtatcgc aaacagtagt 900 

gcaatcgtga ctatcaggca aaatggttat gtgatctatc aaagcaacgt gccagcgggt 960

gcctttgaaa ttaacgatct ctacccctct tccaacagcg gcgatttaga agtcacgatt 1020 

gaagaaagtg acggtacgca acgtcgcttt atccagcctt attcttcatt acccatgatg 1080 
cagcgacctg ggcatctaaa atatagcgcg accgctggac gctatcgcgc tgatgcaaac 1140

agtgatagca aggaacccga atttgctgaa gccacggcaa tatatggttt gaataatact 1200

tttacgctgt atggcggcct gctcggttct gaagattatt atgcgctggg gatcggtatc 1260

ggcggcacac ttggcgcact gggcgcgttg tcgatggata tcaacagagc tgacacccaa 1320 

ttcgataacc agcactcttt tcatggctat caatggcgta cgcagtacat caaagatatc 1380

ccggaaacca acaccaatat cgctgtcagc tactatcgct ataccaacga tggctatttt 1440 

agttttgatg aagccaatac ccgcaattgg gactataaca gtcgccaaaa aagtgaaatt 1500 

caattcaaca tcagccagac aatatttgat ggggtaagtc tgtatgcctc cggttcacag 1560

caagactatt ggggcaataa cgagaaaaac aggaatatct ctgttggggt ttccggccag 1620

caatggggaa ttggttacag cctgaattat caatacagcc gctacactga tcaaaataat 1680

gaccgcgcac tctctttgaa tctcagtatt ccgttagaac gctggttacc gcgtagccgg 1740 

gtttcctatc agatgaccag ccagaaagat cgcccaaccc aacatgaaat gcgtcttgat 1800

ggctcactgc tggatgatgg tcgcctgagc tatagtctgg aacaaagtct ggatgacgat 1860

aacaaccata acagtagcgt gaacgccagt taccgttcac cttatggaac cttcagtgcc 1920

ggatacagtt acggtaatga cagtagccaa tacaattacg gcgttaccgg cggcgtggtt 1980

atccatcctc atggtgtgac gctctcgcaa tatctgggca acgcttttgc gcttattgat 2040 

gctaacgggg catctggcgt gaggatacaa aactatccgg ggattgctac tgatcccttt 2100 

ggctatgcag tggttcctta tctcacaact tatcaggaaa accgtctctc ggtagatact 2160 

acgcagctgc ccgataacgt cgatcttgaa caaacaacac agtttgtggt gcccaacaga 2220

ggtgcaatgg tagcggcgcg tttcaacgcc aatatcggtt atcgcgtact tgttacagtc 2280

agcgatcgca acggtaaacc gttgcccttt ggcgctcttg ccagcaacga tgatacgggg 2340 

caacaaagta tcgtcgatga gggcggcata ctatatctct ctgggatatc gagtaaatca 2400

caaagctgga ctgtacgctg gggaaatcag gcagatcaac aatgtcagtt tgcttttagt 2460 

acaccggatt cagaaccaac aacctctgta ttacaaggca cagcgcagtg ccattaa 2517 


<212> Type : DNA 
<211> Length : 2517 

SequenceName : SEQ ID 426 

SequenceDescription : 


Sequence 



<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString :

atgatgttca gaaatagaat attactaata tttatattgt gggctaattt tacctgggct 60

gggtgtcgta ctactgcatc attaaatatt acagatggta ttaatgttgg ggagatttta 120 

gcgaatgaaa cttcctttag taaaagtgtc gtgtttactg ggatatcttg tgatacgagc 180 

acggataaaa tagtttataa aaatatccaa agtgattggg ttgaagttgg gccttttggt 240

aatggcgaaa aattaaaggt taaaatagag tctttaggta aaaccagcga cacaattggg 300 

aaatccagca atgcgcaggc agtattacct tatgtggtta aaatagccag aggcacacct 360

gattttactg gagaaagaaa atctacctgg tttatttcag ataccgtgat tgcaaatatt 420 

ggcggtgagt catcgtcatc catcgatttt tggttgggta tttgtaaggc attgaagttt 480 

aactggtgtg tgaattatct caccagcaaa ctggcggggg atacatttac gcttgggtta 540

aatatttcct attatcctaa aaatacgacc tgtaagcctg aaaacaccgt tataaaagta 600 

gatgatatcg ccttgttcca gctcagaaat cagggaaaga ttgcggcgaa cagtaaggaa 660

ggaacaatta cgttgaaatg tgataatctt ttcggcgaca aaaaacaagc atcgcggaat 720 

atggttgtat atctttctag cagtgactta gttaaaggaa gtaatactat tttgcgtggt 780

aaaacagata atggtgtagg gtttgtgttg gatctaacag aaccaccaaa agggactgag 840 

gctgccatta aaatttcggc caacggcgat cagggcgcgg cgacatcatt atggaaaaca 900 

gataaaccag gagtttcatt aaatagcaac attattaata taccagtcat ggccagttac 960

tatgtatatg atgaaaaaaa agttaaatct ggcgcactgg aagcaaccgc attaatcaac 1020 

gtgaaatacg attaa 1035 
<212> Type : DNA 
<211> Length : 1035 

SequenceName : SEQ ID 427

SequenceDescription : 

Sequence 



<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString : 

atgattaaaa aagcttcgct gctgacggcg tgttctgtca cagccttttc cgcttgggca 60 

caggatacca gcccggatac tctcgtcgtt actgctaacc gttttgaaca gccgcgcagc 120 

actgtgcttg caccaaccac cgttgtgacg cgtcaggata tcgaccgctg gcagtcgacc 180 

tcggttaatg atgtgctgcg ccgtcttccg ggcgtcgata tcacccaaaa cggcggttca 240

ggtcagctct catctatttt tattcgcggt acaaatgcca gtcatgtgtt ggtgttaatt 300

gatggcgtac gcctgaatct ggcggggggg agtggttctg ccgaccttag ccagttccct 360 
attgcgcttg tccagcgtgt tgaatatatc cgtgggccac gctccgccgt ttatggttcc 420
gatgcaatag gtggggtggt gaatatcatc acgacgcgcg atgaacccgg aacggaaatt 480
tcagcagggt ggggaagcaa tagttatcaa aactatgatg tctctacaca gcaacaactg 540
ggggataaga cacgagtaac gttgttgggc gattatgccc atactcatgg ttatgatgtt 600
gttgcctatg gtaataccgg aacgcaagcg cagccagata acgatggttt tttaagtaaa 660
acgctttatg gcgcgctgga gcataacttt actgatgcct ggagcggctt tgtgcgcggc 720
tatggctatg ataaccgtac caattatgac gcgtattatt ctccgggttc accattggtc 780
gatacccgta aactctatag tcaaagttgg gacgccgggc tgcgatataa cggcgaactg 840
attaaatcac aactcattac cagctatagc catagcaaag attacaacta cgatccccat 900
tatggtcgtt atgattcgtc ggcgacgctc gatgagatga agcaatacac cgtccagtgg 960
gcaaacaaca tcatcattgg ccacggtaat gttggtgcgg gtgttgactg gcagaagcag 1020
agcacggcac cgggcacagc ttatgttaag gatggatatg atcaacgtaa taccggcatc 1080
tatctgaccg ggctgcaaca agtcggcgat tttacctttg aaggcgcagc acgcagcgac 1140
gataactcac agtttggtcg tcatggaacc tggcaaacca gcgccggttg ggaattcatG 1200
gaaggttatc gcttcattgc ttcctacggg acatcttata aggcaccaaa tctggggcaa 1260
ctgtatggct tctacggaaa tccgaatctg gacccggaga aaagcaaaca gtgggaaggc 1320
gcgtttgaag gcttaaccgc tggggtgaac tggcgtattt ccggatatcg taacgatgtc 1380
agtgacttga tcgattatga tgatcacacc ctgaaatatt acaacgaagg gaaagcgcgg 1440
attaagggcg tcgaggcgac cgccaatttt gataccggac cactgacgca tactgtgagt 1500
tatgattatg tcgatgcgcg caatgcaatt accgacacgc cgttgttacg ccgtgctaaa 1560
cagcaggtga aataccagct cgactggcag ttgtatgact tcgactgggg tattacttat 1620
cagtatttag gcactcgcta tgataaggat tactcatctt atccttatca aaccgttaaa 1680
atgggcggtg tgagcttgtg ggatcttgcg gttgcgtatc cggtcacctc tcacctgaca 1740
gttcgtggta aaatagccaa cctgttcgac aaagattatg agacagtcta tggctaccaa 1800
actgcaggac gggaatacac cttgtctggc agctacacct tctga 1845
<212> Type : DNA
<211> Length : 1845

SequenceName : SEQ ID 428
SequenceDescription :

Sequence



<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString :
atgaaaaaca aattgttatt tatgatgtta acaatactgg gtgcgcctgg gattgcagcc 60
gcagcaggtt atgatttagc taattcagaa tataacttcg cggtaaatga attgagtaag 120
tcttcattta atcaggcagc cataattggt caagctggga ctaataatag tgctcagtta 180
cggcagggag gctcaaaact tttggcggtt gttgcgcaag aaggtagtag caaccgggca 240
aagattgacc agacaggaga ttataacctt gcatatattg atcaggcggg cagtgccaat 300
gatgccagta tttcgcaagg tgcttatggt aatactgcga tgattatcca gaaaggttct 360
ggtaataaag caaatattac acagtatggt actcaaaaaa cggcaattgt agtgcagaga 420
cagtcgcaaa tggctattcg cgtgacacaa cgttaa 456
<212> Type : DNA
<211> Length : 456

SequenceName : SEQ ID 429
SequenceDescription :



Sequence 

<213> OrganismName : Escherichia coli O157:H7
<400> PreSequenceString :

atgaacattt ttgcatattt actggtactt gtattttcca tgagcatgag cagcagcgcg 60 
tttgccagcg tggtaatgac cggaacccgt attattttcc ctggtgacgc aaaggaaaaa 120

accatccagt tgcgaaatac cagcgatcag ccctatatca ttaatatcca tgttgaggat 180 

gaacgtggtt ctgacaagaa tgtaccgttt atgccaaccc cgcagacatt tcgcatggaa 240
gctgccgcag gtcaggcgtt acgcctgctc tacactggta ataatttacc gcaggatcgc 300

gagtctgttt tctggtttag tttcagtcaa ctaccttatc tgaataagaa tgataaaagt 360
cagaaccagc tcatcctggc cctgactaat cgagtcaaaa ttttctatcg tcccagctcg 420 
attgtcggta aatccagtga cgcacccaaa aacctgactt accaggtaaa acagaaccgc 480 

attgaagtga cgaatcccac gggctattac gtcacaattc gcgccgctga actgcttaat 540
aatggtaaaa aagtccccct cgcgaattcg gtaatgattg ctcctcaaag cacaactgaa 600 
tggacactac cctctggcat cagtgtcgct cccggtgcgc agatccattt agtgaccgtc 660 
aacgactatg gcgtaaatgt tacgtctgag catgccttat aa 702
<212> Type : DNA

<211> Length : 702

SequenceName : SEQ ID 430 
SequenceDescription : 



